Lead Software Engineer with over 10 years of experience designing, scaling, and operating large-scale distributed systems in cloud-native environments. Specialized in Kubernetes, AWS, observability, and site reliability engineering, with a track record of improving reliability, automation, and cost efficiency across 400+ EKS clusters. Strong background in applied AI/ML, including prompt engineering, LangChain-based automation, and AI-driven tools for root cause analysis and changelog generation. Recognized for technical leadership, cross-functional collaboration, and mentoring engineers in adopting modern SRE and AI practices.
LFS101x.2: Introduction to Linux