Consultant – RNA Platform and Reliability Engineering | Automotive Domain
Cross-Functional Leadership and Technical Delivery
- Delivered reliability dashboards, platform enhancements, and live-site leadership while transitioning across three RNA teams: SRE (SLA/SLO), Vehicle Lifecycle Management (VLM), and Campaign & Service Activation.
Site Reliability Engineering (SRE) – SLA/SLO Dashboard (Project Lead & Contributor)
- Led the design and implementation of SLA/SLO observability dashboards across key RNA domains, driving the adoption of SRE practices and defining KPIs for reliability and availability used in weekly operations reviews.
- Standardized telemetry and logging across services to enable accurate KPI measurement and proactive alerting.
- Built automated data pipelines for ingesting, transforming, and visualizing reliability signals for executive and DRI consumption.
- Automated ETL workflows using Azure Data Factory; configured Automation Accounts and Runbooks to enhance operational efficiency.
- Analyzed performance metrics using KQL in Azure Monitor; maintained Azure SQL Database for reliable data management.
- Collaborated via Azure DevOps Boards and Repos to ensure streamlined team coordination and delivery.
Platform Engineering – Automotive Lifecycle Management (ALM)
- Delivered features for onboarding new vehicles/devices, linking assets, and establishing bootstrap flows across multiple OEM/device variants.
- Defined integration patterns for high-throughput, event-driven scenarios in collaboration with cross-functional stakeholders.
- Orchestrated microservices using Azure Service Fabric; implemented Azure Functions for scalable serverless computing.
- Managed data storage and retrieval using Azure Cosmos DB, Azure Table Storage, and Azure Queue Storage to optimize performance and scalability.
Operations & Incident Management – Campaign & Service Activation (Module Lead / Incident Manager)
- Led live-site operations, diagnosing production issues, deploying customer-facing fixes, and restoring services within SLA constraints.
- Drove post-incident reviews and accountability to prevent recurrence and improve activation flows.
- Developed incident response and RCA playbooks; leveraged Azure Monitor and ICM workflows for efficient incident handling.
- Governed Azure DevOps work items to maintain consistent project oversight and delivery.
Copilot Agents Development
- Designed and optimized Copilot agents for a major American retail client to automate invoice processing and analysis.
- Technologies used: Copilot Studio, Power Automate, Office Scripts.
Azure Observability Framework (AOF) – Lead Developer
- Spearheaded the development of a plug-and-play observability framework for scalable alert management across Azure resources.
- Integrated Azure Monitor and Log Analytics to enhance custom log monitoring and downstream system notifications.
- Revamped framework architecture to reduce complexity and improve deployment speed.
- Created automation scripts and ARM templates for consistent alert generation and resource provisioning.
- Built customizable alerting solutions supporting both native Azure metrics and custom logs.
- Developed scalable serverless applications using Azure Functions; implemented CI/CD pipelines via Azure DevOps.
- Secured application secrets with Azure Key Vault; automated workflows using PowerShell and Automation Accounts.