ML- and AIOps-oriented Automation, DevOps, and Reliability Engineer with 5+ years of experience designing intelligent automation, observability-driven monitoring, and self-healing systems for cloud-native applications. Strong background in Python and Java development, CI/CD engineering, performance and reliability testing, and cloud operations. Hands-on experience with metrics, logs, anomaly detection, and system health analysis using Prometheus and Grafana. Proven ability to reduce downtime, improve MTTR, and automate failure detection and remediation. Well aligned with Oracle Cloud Infrastructure (OCI), AIOps, SRE, and ML-driven operations roles.
Programming & Development: Python (automation, performance, metrics/log analysis), Java (automation frameworks, backend validation), REST APIs, SQL, Bash/Shell scripting
Cloud, DevOps & Platform Engineering: Oracle Cloud Infrastructure (OCI), AWS, Azure, CI/CD (Jenkins, GitLab CI), Docker, deployment & release validation, infrastructure-aware testing
AIOps, Observability & Reliability: Prometheus, Grafana, metrics & log analysis, anomaly detection, incident triage & RCA, MTTR reduction, self-healing & auto-remediation, reliability & resilience engineering
Performance & Systems Engineering: Load, stress & scalability testing (JMeter, Locust), capacity planning, distributed systems validation, production-scale traffic simulation
Security, Quality & Resilience: OWASP ZAP, Burp Suite, API reliability & fault tolerance, functional, integration, regression & system testing, shift-left & risk-based testing
Data, ML & AIOps Foundations: Data-driven failure analysis, trend & pattern detection, AIOps fundamentals, predictive-monitoring-aligned automation
Engineering Practices: Agile/DevOps (Scrum), Jira, Azure Boards, cross-functional collaboration (Dev, SRE, Platform, Infra), technical documentation & reporting