Results-driven technology leader with 11 years of experience in software production, DevOps, and Site Reliability Engineering (SRE), seeking the Director of Software Production and SRE position at Morgan Stanley. Passionate about driving operational excellence, optimizing system reliability, and leading high-performing engineering teams to deliver scalable, resilient, and highly available software solutions. Adept at implementing best practices in CI/CD, observability, incident management, and infrastructure automation to enhance performance and reduce downtime. Committed to fostering a culture of innovation, continuous improvement, and cross-functional collaboration to align engineering efforts with business objectives.
Professional Summary:
Seasoned Production Support/Site Reliability Engineering (SRE) Leader with expertise in ensuring the stability, reliability, and performance of mission-critical production environments. Proven ability to proactively detect, troubleshoot, and resolve complex application issues, coordinating cross-functional teams (development, infrastructure, and business stakeholders) to minimize downtime and maintain seamless operations.
Key Strengths:
- Incident & Outage Management: Own end-to-end resolution of escalated production issues, ensuring clear communication, root cause analysis (RCA), and workarounds to restore service rapidly.
- Process Optimization: Develop and enforce policies for Change Implementation Management (CIM), deployment standards, and proactive monitoring to prevent recurring incidents.
- Knowledge Leadership: Build and maintain a centralized knowledge base to empower teams with self-service troubleshooting tools, reducing dependency on external resources.
- Collaborative Governance: Partner with development teams to embed production readiness early in the SDLC, ensuring new systems meet reliability, scalability, and operational standards.
- Continuous Improvement: Advocate for observability-driven practices (logging, metrics, APM) and post-mortem reviews to drive long-term system resilience and user experience enhancements.
Committed to fostering a culture of accountability, transparency, and innovation while balancing rapid issue resolution with strategic prevention.
Project: RJ Reynolds
Project: Sanofi
Project: WSI (Williams-Sonoma, Inc.)
Project: AmeriHealth Caritas
Project: Nielsen - Census Collections
Project: HSN (Home Shopping Network)
English, Hindi, Telugu
https://bold.pro/my/ramakrishna-yalla/162r