Currently: Senior Software Engineer • 8+ years

Kasi Viswanath G

I build production distributed systems, and I’m shifting my focus toward ML infrastructure and GPU performance engineering.

Impact highlights
  • Event-driven systems with Kafka
  • Kubernetes-based production workloads
  • Seasonal peak scaling (US sale events)
  • Airflow + BigQuery data pipelines
  • Performance tuning across SQL + async processing
Distributed Systems • Performance • ML Infra (focus) • GPU efficiency (building)

Technical Foundation

8+ years building production distributed systems in fintech and high-volume e-commerce environments. Strong background in concurrency, event-driven architectures, data systems, and performance optimization.

Distributed Systems & Infrastructure
  • Designed and maintained event-driven architectures using Kafka for high-throughput order processing.
  • Deployed and operated containerized workloads on Kubernetes, with scaling planned around peak traffic.
  • Built resilient, idempotent workflows for correctness under high concurrency and retry scenarios.
  • Developed Airflow DAGs for orchestrating production data pipelines.
Performance & Data Systems
  • Optimized Azure SQL queries and indexing strategies for performance-critical workflows.
  • Improved processing throughput via partition tuning, batching, and concurrency adjustments.
  • Integrated analytics pipelines using BigQuery for operational insights.
  • Strong understanding of system bottlenecks: CPU, I/O, memory, and network constraints.
Java Ecosystem • Kafka • Kubernetes • Azure SQL • BigQuery • Airflow • Distributed Systems • Concurrency • Performance Tuning

Current Focus: ML Infrastructure & GPU Performance

I am building deeper expertise in low-level systems and GPU compute to transition into ML infrastructure and performance engineering roles. My focus is on understanding compute efficiency from the hardware layer upward: memory hierarchy, parallelism, and bottleneck analysis.

C++ & Systems Programming
  • Strengthening fundamentals in memory management, data layout, cache behavior, and multithreading.
  • Studying performance tradeoffs between abstraction and low-level control.
  • Exploring lock-free and concurrency-oriented design patterns.
CUDA & GPU Compute
  • Learning the CUDA execution model: threads, warps, blocks, and grids.
  • Understanding the GPU memory hierarchy: registers and global, shared, and constant memory.
  • Profiling workloads to tell memory-bandwidth-bound bottlenecks from compute-bound ones.
Triton & ML Systems
  • Exploring Triton kernel development for high-performance tensor operations.
  • Studying distributed training internals (DDP, NCCL, scaling patterns).
  • Building toward reproducible benchmarking and profiling-driven optimization workflows.
C++ • CUDA • Triton • GPU Profiling • Parallel Computing • Distributed ML Systems • Performance Engineering

Experience

Senior Software Engineer
Present
Top e-commerce company (US) — Order Fulfillment
Kafka • Kubernetes • Azure SQL • Airflow • BigQuery • Performance
  • Built and maintained distributed order fulfillment workflows using Kafka and Kubernetes in a high-volume environment.
  • Designed scalable event-driven processing pipelines resilient to seasonal peak traffic spikes (US sale events).
  • Improved throughput and reduced processing latency via partition tuning, async batching, and concurrency optimization.
  • Developed Airflow DAGs powering analytics and operational workflows via BigQuery.
  • Optimized Azure SQL queries and indexing strategies for order state transitions and operational reporting.
  • Implemented idempotent, retry-safe patterns to ensure correctness and reliability under high concurrency.
Software Engineer
Past
Kotak Cherry (Fintech) — Digital Investment Platform
Backend • Reliability • Performance • Observability • Fintech
  • Contributed to backend services powering a digital investment platform with a six-figure registered user base.
  • Built and optimized portfolio aggregation and transaction-related workflows.
  • Improved API performance and query efficiency for latency-sensitive user flows.
  • Strengthened observability and production monitoring to improve reliability and incident response.
Technical focus

I’m targeting ML infrastructure and performance roles, leveraging my background in distributed systems. Current focus areas: profiling, benchmarking, GPU utilization, and scalable training/inference systems.

Distributed systems • Event streaming • K8s workloads • Workflow orchestration • Data systems • Performance optimization • ML infra (focus) • GPU efficiency (building)

Contact

© 2026 Kasi Viswanath Systems