Site Reliability Engineer with 6 years of experience in incident management and cloud operations. Achieved 99.9% uptime and reduced mean time to detect (MTTD) by 90% through automation and proactive solutions. Proficient in GCP, Kubernetes, and CI/CD, with a focus on improving system reliability and deployment efficiency.
• Local LLM deployment: Hosted a Hugging Face LLM with Flask; monitored with Prometheus
• Mental health chatbot: Built with Node.js, React, and the Gemini API; CI/CD via Docker and GitHub Actions