Results-driven Linux System Administrator with 12+ years of IT experience, including 6 years focused on Linux systems. Skilled in AWS, Ansible, Terraform, Kubernetes, and Jenkins, and proficient in optimizing infrastructure for reliability and efficiency.
· Periodically performing upgrades and patching on RHEL / SLES / Ubuntu servers.
· Performing storage migration on physical servers.
· Worked closely with the application, network, database, and storage teams on performance-related issues.
· Managing SUSE clusters, including maintenance mode, moving resources from one node to another, and troubleshooting cluster-related issues.
· Identifying and troubleshooting issues related to file systems, CPU, I/O, memory utilization, and boot time.
· Automating tasks using cron and shell scripting.
· Worked closely with OS vendors such as Red Hat and SUSE on problem tickets (RCA) to reach a resolution.
· Implementing NIC bonding.
· Adding routes at the request of customers or other teams and making them persistent.
· Migrating servers from CentOS 7 to RHEL 8.
· Provided application support for Apache and Tomcat, changing configuration files as requested by the customer's development team.
· Preparing a POA (points of action) for any activity on customer servers.
· Worked on incidents, service requests, and change tickets from ServiceNow.
· Managing server hardware, including replacement of faulty HDDs and RAM with the help of the vendor.
· Worked with server hardware vendors such as Dell, HP, and Cisco on hardware-related issues.
· Implementing hardening on newly deployed servers.
· Conducting OS patching to mitigate vulnerabilities and enhance system security.
· Migrating physical servers to virtual machines.
· Migrating servers from Solaris 10 to RHEL 7.
· Provisioning new VMs per application team requirements.
· Managing file systems and disk partitions.
· Creating bash scripts and scheduling cron jobs for automating manual tasks.
· Administering the Nagios monitoring tool (adding / removing servers).
· Performing firmware upgrades on servers to ensure optimal performance and security.
· Monitoring daily backups and troubleshooting failed backup tasks.
· Monitoring servers with HP TeMIP.
· Troubleshooting as per SOP.
· Managing file systems and disk partitions.
· Troubleshooting file system issues.
· Working as per ITIL process (Incident, Problem, Change).
· Investigating and troubleshooting hardware issues and coordinating with vendors.
· Monitoring servers and application operations to maintain uptime.
· Generating and allocating incident tickets across relevant teams.
· Conducting regular application sanity checks to ensure proper functionality.
· Coordinating with various hardware vendors to address server hardware issues.
· Facilitating communication with relevant teams to address ongoing issues promptly.
· Preparing daily incident reports for review and analysis.
· Developing standard operating procedures (SOP) for the implementation of sanity checks for the new application.
· Conducting training sessions for newly joined team members.
AWS Certified Solutions Architect – Associate
ITIL Foundation
RHCE – 150-025-559