Managing over 2,000 ECS and 170,000 CDP clusters in the Compute Operations (DevOps) team, working with Docker containers and building networking expertise
Worked on a Kubernetes namespace management service for Cloudera's environment
The tool continuously monitors the cluster, classifies namespaces, and applies appropriate labels
It enforces security compliance by managing role bindings and Security Context Constraints (SCCs)
Implemented logic to detect control planes using Kubernetes API queries and HTTP endpoint validation
Automated namespace expiry resets and security configurations for OpenShift clusters
This solution significantly reduced manual effort, improved security posture, and ensured better namespace organization across the cluster
Made a major contribution to the 'Prewarming of Nodes' project, which addressed delays and errors during OS installation in cluster provisioning by installing the OS ahead of time
Created disk images that boot the system into a LiveOS and then write the image to the host disk
This reduced deployment time from 45 minutes to 10 minutes and lowered CDEP failure rates to below 5 percent
Developed an advanced workflow for cluster deployment in Ansible, with comprehensive process tracking in AWX
The workflow integrates a wide range of playbooks and jobs to manage end-to-end cluster deployment, including setting up the deployment environment, configuring TFTP, DHCP, and ISO-related packages, enabling PXE booting for nodes, implementing load balancing, and managing internal services
Additionally, it handles network configuration, DNS setup, and other critical components to ensure seamless deployment and operation
Played a key role in the development of HAWK, a Slack-integrated chatbot that enables users to manage their YCloud applications, clusters in KCloud, and resources in shared OpenShift KCloud clusters
Enhanced the chatbot by adding multiple endpoints, expanding functionality, and supporting different OS versions
Key improvements include simplifying host extraction and listing within clusters and pools (previously a complex task) through Animus queries, which significantly reduced effort and saved time
Implemented a feature that lets users extend namespace expirations with a single Animus chatbot command, along with other enhancements that improve user convenience and efficiency
Handling critical bug fixes, ticket resolutions for cluster changes, and audits of active and decommissioned hosts, and assisting teams in resolving cluster deployment issues
SDE Intern
BlackRock
01.2022 - 08.2022
Worked in the Payments Development Team within the ALADDIN Product Group (Investment Operations Division)
Developed a Paycheck application for retrieving, modifying, and updating payments
Designed the front-end using Angular, making it easier for clients to identify and resolve cash flow or payment mismatches
Enabled clients to track payment statuses by providing essential criteria such as Asset ID, Record Date, Payment Date, and Currency Type