Pranav Bhendawade
Cloud & Platform Engineer
Software Engineer with 5+ years architecting and operating reliable, scalable distributed systems. Expertise in Kubernetes on AWS (EKS) and GCP (GKE), IaC, GitOps, and service mesh networking. Owned services end-to-end: production code, on-call rotations, incident response, and AI-powered operational tooling. Expanding into AI/ML infrastructure — GPU scheduling, vLLM serving, and agentic observability tooling.

About Me
I’m a Software Engineer with 5+ years architecting and operating reliable, scalable distributed systems. Currently at Automation Anywhere, I own services end-to-end: production code, on-call rotations, incident response, and AI-powered operational tooling across 90+ GKE/EKS clusters.
I’m passionate about platform engineering, infrastructure automation, and the intersection of cloud-native systems with AI/ML — from GPU workload orchestration to agentic observability tooling.
Experience
Cloud Engineer
Automation Anywhere
- Engineered an Istio service mesh across 90+ production Kubernetes clusters (EKS/GKE) with mTLS, SNI-based routing, and east-west gateways, securing 100% of cross-cluster traffic including AI/ML inference workloads.
- Architected a two-tier IaC design with a GitOps-driven control plane for environment lifecycle and a Terraform infra plane for AWS/GCP provisioning, reducing onboarding time by 70% and eliminating manual configuration drift.
- Standardized Helm charts and ArgoCD GitOps pipelines across 90+ clusters, cutting deployment cycle time by 40% and manual configuration errors by 60%.
- Diagnosed production issues (network partitions, DNS failures, pod scheduling, Envoy misconfigs), reducing resolution time by 35% while optimizing fleet-wide autoscaling for 30–40% cost reduction.
- Developed a Python-based incident response agent using FastAPI and MCP that autonomously correlates logs, metrics, and K8s events, reducing MTTD by 60% and cutting manual triage across on-call rotations.
- Tech Stack: Kubernetes, Helm, ArgoCD, Istio, Terraform, AWS (EKS), GCP (GKE), Python, FastAPI, MCP, Prometheus, Grafana
Software Engineer Intern, Linux Systems
Activision Blizzard
- Constructed an infrastructure configuration discovery service in Python that aggregated real-time resource data across multiple sources, providing accurate platform visibility and reducing deployment misconfigurations by 40%.
- Streamlined provisioning of the service by containerizing it and deploying it on Kubernetes with Terraform-provisioned infrastructure on OpenStack, reducing deployment time by 75% and provisioning time by 90%.
- Tech Stack: Python, Bash Scripting, Docker, Kubernetes, Terraform, OpenStack
Cloud Engineer
Tata Consultancy Services
- Designed AWS VPC networking for EKS across 3 environments with CNI-aware subnet sizing, pod-level security groups, NACLs, and VPC peering for cross-environment service discovery supporting 100+ node clusters.
- Spearheaded on-call rotation and incident escalation for production infrastructure, establishing runbooks and driving root-cause analysis; contributed to a 45% MTTR reduction using Prometheus/Grafana alerting and log analysis.
- Automated AWS resource inventory and cost reporting across 20+ regions using Python, reducing manual overhead by 80% with real-time utilization dashboards.
- Administered PostgreSQL RDS and ElastiCache Redis serving 15+ production services, tuning parameters and connection pooling for 99.9% uptime. Triaged Linux incidents (OOM kills, network stack, systemd) via strace and tcpdump.
- Tech Stack: Python, AWS (EC2, EKS, S3, IAM, RDS, ElastiCache, CloudWatch), Kubernetes, Prometheus, Grafana, ELK
Software Engineer Intern
CloudYuga Technologies
- Implemented reusable Terraform modules for GKE networking, compute, and storage with GitHub Actions CI/CD checks, reducing infrastructure provisioning time from days to under 2 hours.
- Launched an EdTech platform on GKE using Kubernetes manifests with RBAC, ConfigMaps, Secrets, Ingress, and PVCs, reducing rollout failures by 35%.
- Tech Stack: Terraform, Kubernetes, GCP (GKE), GitHub Actions, Ruby on Rails, ReactJS
Education
Master of Science in Computer Science and Engineering
2022 - 2024
Bachelor of Technology in Computer Science and Engineering
2015 - 2019
Projects
GPU-Aware Autoscaling Platform on EKS
Orchestrated GPU workload scheduling on EKS using Karpenter and KEDA for autoscaling on GPU utilization and queue depth, reducing GPU idle time by 40%. Instrumented Prometheus/Grafana dashboards with NVIDIA DCGM metrics for real-time GPU visibility.
Kubernetes Troubleshooting Agent
Devised a webhook-triggered agentic system in which a FastAPI agent uses a custom MCP server to query cluster metrics, logs, and events and posts root-cause analysis to Slack, cutting incident triage time by 50%.
Achievements



