

Results-Driven IT Manager | Big Data | DevOps | Cloud | AIOps
Innovative and performance-focused IT leader with 12+ years of experience driving large-scale Big Data, Cloud, and DevOps transformations. Holds an M.Tech in VLSI & Embedded Systems with a proven history of architecting and delivering high-availability, enterprise-grade solutions across Consumer, Industrial, Financial, and Healthcare domains.
Big Data Expertise: Specialized in Hadoop ecosystem administration (Cloudera, Hive, HBase, NiFi, Kafka) with deep hands-on experience in cluster deployment, capacity planning, performance tuning, monitoring, and disaster recovery strategies.
Cloud & DevOps Mastery: Skilled in AWS and Azure cloud platforms, implementing CI/CD pipelines, Docker, Ansible, and Terraform to streamline deployments, enhance scalability, and ensure secure environments.
AIOps & Automation: Pioneered the integration of Generative AI and Prompt Engineering to automate IT operations, enabling AI-driven log analysis, incident triaging, and root cause identification, reducing downtime and improving system reliability.
Data & Analytics: Proficient in SQL administration and integrating structured/unstructured data into Big Data pipelines for real-time analytics, dashboards, and enterprise reporting.
Kafka Administration: Extensive experience in Kafka cluster management, including setup, replication, partitioning, ACL-based security, and performance optimization for high-throughput data systems.
Leadership & Delivery: Proven record of leading cross-functional global teams, mentoring junior engineers, managing resource planning, and driving Agile/Scrum project delivery. Recognized for building high-performing teams, fostering knowledge sharing, and developing SOPs that improved operational efficiency and minimized incident response times.
Agile & Scrum methodology
Team Management, Mentoring & Stakeholder Collaboration
Team leadership
Resource optimization
Infrastructure management
Effective communication
Strategic planning
Conflict resolution
Problem solving
Process optimization
Budget control
Shift scheduling
Complex Problem-solving
Workforce management
Stakeholder management
Project planning
Key performance indicators
Documentation and reporting
Goal setting
Coaching and mentoring
Decision-making
Resource allocation
Contract management
Operations management
Technical mentoring
ITIL Certified – Incident, Problem & Change Management
Hadoop Ecosystem (HDFS, YARN, MapReduce, Hive, Impala, HBase, Sqoop)
Kafka (Setup, Administration, Security, Optimization)
Spark (Spark SQL, Spark MLlib, Data Processing & Analytics)
NiFi, ZooKeeper
Cloudera, Hortonworks (HDP/HDF), Ezmeral Data Fabric (MapR)
AWS (EC2, S3, IAM, RDS, EMR, Lambda, CloudWatch, Auto Scaling, Security)
Azure & GCP (exposure)
Docker, Kubernetes, Ansible, Terraform
CI/CD (Jenkins, Git, GitHub/GitLab, Bitbucket)
Shell Scripting, Python automation
Prometheus, Grafana, Nagios
Elasticsearch, Kibana (Log Analytics, Visualization)
Splunk (exposure)
Generative AI for IT operations (AI-driven log analysis, RCA, incident triaging)
Prompt Engineering for automation, troubleshooting, and knowledge base generation
AIOps (Monitoring, Predictive Analytics, Automated Remediation)
Hadoop administration
Cloud services
DevOps practices
AIOps implementation
Big data platforms
IT policy development
Project management
Process optimization
Team leadership
Resource allocation
Stakeholder engagement
Problem solving
Effective communication
Performance monitoring
Change management
IT risk management
Training and mentoring
Quality assurance
Documentation and reporting
IT governance
Technical onboarding design
Information security
Technical troubleshooting
Enterprise architecture
Team development
IT service management
IT compliance
IT resource use
Operating system management
ITIL framework
Technical leadership
Documentation management
Team collaboration
Application support
IT budgeting
Risk mitigation planning
Technical support oversight
Disaster recovery
IT infrastructure
Multitasking capacity
Teamwork and collaboration
IT infrastructure proficiency
Agile work processes
Team building
Strategic planning
Client Server Management
Vendor management
Roles & Responsibilities:
· Managed and led a high-performing team of 21 engineers across two global regions in a 24×7 support model.
· Oversaw resource management, recruitment, onboarding, and training programs to build a scalable workforce.
· Designed and optimized shift schedules to ensure smooth execution of operations and SLA compliance.
· Acted as Onsite Coordinator to facilitate collaboration among distributed teams, customers, and stakeholders.
· Directed end-to-end implementation and automation of Big Data platforms, Cloud services, and DevOps pipelines.
· Monitored SLA adherence, minimizing downtime and enhancing service reliability through proactive measures.
· Delivered cloud migration initiatives, improving system scalability, availability, and performance metrics.
· Championed process automation and AIOps to reduce manual intervention and improve operational efficiency.
· Led IT infrastructure projects to enhance system performance and reliability.
· Managed cross-functional teams to implement software solutions efficiently.
· Developed IT policies to ensure data security and compliance standards.
· Provided technical support and training to staff on new systems.
· Evaluated emerging technologies for potential adoption within the organization.
· Facilitated change management processes during system upgrades and migrations.
· Developed and implemented IT policies and procedures to ensure compliance with industry standards.
· Monitored system performance and identified areas for improvement.
· Maintained strong knowledge of applicable regulations to guarantee that designs, operations, and IT systems met those requirements.
Roles & Responsibilities:
· Led a team of Hadoop administrators to manage enterprise-scale Big Data platforms, ensuring high availability, scalability, and performance.
· Performed Ambari, HDP, and HDF upgrades, patch management, and continuous platform improvements.
· Administered and optimized Hadoop, Kafka, and Snowflake solutions, ensuring seamless integration with business-critical applications.
· Managed capacity planning, hardware provisioning, and AWS instance resizing to meet performance and cost objectives.
· Monitored clusters, performed backup & recovery, security compliance, and RCA-based issue resolution.
· Implemented automation scripts to streamline monitoring, alerts, and administrative tasks.
· Collaborated with application and infrastructure teams to deploy new environments, integrate Elasticsearch/Kibana, and provide analytics through Spark MLlib.
· Delivered knowledge transfer, documentation and mentoring for junior administrators.
· Led the team in implementing efficient workflows and best practices.
· Coordinated cross-functional meetings to enhance project collaboration.
· Mentored junior staff on industry standards and operational procedures.
· Developed training materials to improve team knowledge and skills.
· Monitored project timelines to ensure adherence to schedules.
· Streamlined communication channels among team members and departments.
· Evaluated performance metrics to identify areas for improvement.
· Facilitated problem-solving sessions to address operational challenges.
· Provided leadership and guidance to team members, ensuring that tasks were completed on time and to a high standard.
· Delegated daily tasks to team members to optimize group productivity.
Roles & Responsibilities:
· Directed the administration of Hadoop clusters (HDP/HDF), including patching, version upgrades, and high availability configuration.
· Managed user provisioning, quotas, security (Kerberos, ACLs), and Ambari-based service administration.
· Conducted cluster performance tuning, health checks, disk management, and log monitoring.
· Ensured smooth data migrations using DistCp and NiFi (RDS → Hadoop clusters).
· Coordinated with global teams for incident resolution, RCA documentation and service provider escalations.
· Integrated Kafka and Spark on HDFS for data ingestion and processing.
· Designed backup & disaster recovery (BDR) strategies and tested failover mechanisms.
Roles & Responsibilities:
· Installed, configured, and managed multi-node Hadoop clusters using Ambari and Cloudera distributions.
· Upgraded and maintained Hadoop services, ensuring cluster reliability, scalability and data consistency.
· Configured and managed core-site.xml, hdfs-site.xml and mapred-site.xml for custom business requirements.
· Conducted data migrations across clusters and maintained secure environments via Kerberos and ZooKeeper coordination.
· Administered Kafka, Hive, HBase, NiFi, and Spark, enabling seamless data flow pipelines.
· Performed performance tuning for Hadoop jobs, disk utilization monitoring, and cluster health reviews.
· Worked closely with application teams for patch planning, version upgrades and troubleshooting.
Roles & Responsibilities:
· Installed and deployed Hortonworks Data Platform on multi-node clusters through Ambari.
· Managed commissioning/decommissioning of nodes, user provisioning, and quota management.
· Configured Kafka, Spark, Cassandra, and HBase with Hadoop to support IoT and banking sector analytics use cases.
· Implemented backup, disaster recovery, and security policies to safeguard enterprise data.
· Integrated Elasticsearch and Kibana with Hadoop for log analytics and reporting.
· Handled Linux administration tasks (user/group management, permissions, package installation using YUM/RPM).
· Defined data retention policies and performed cluster monitoring, issue resolution and proactive capacity planning.
· Documented operational procedures, knowledge base articles, and SOPs for future use.
• ITIL 4 Foundation Certification, CNO: GR671355827SA
• IT Management Certification from top B-school ISB Hyderabad
• AWS Certified Solutions Architect - Associate
• Leadership Excellence Acceleration Program, HPE
PAPERS PUBLISHED
Name of journal: IJMETMR, Serial no.: ISSN 2348-4845
Topic: Design of All-Optical Reversible Logic Circuits Using Novel Optical Reversible Gates
Name of journal: IJSER, Serial no.: ISSN 2229-5518
Topic: Crime Data Analytics using Big Data Hadoop Spark & Zeppelin
MEMBERSHIPS
Engineering for Change-E4C Community (Supported by IEEE, ASME, ASCE, OSA)
International Association of Engineers (IAENG)