Experienced in building and maintaining data pipelines to ensure seamless data flow, leveraging advanced knowledge of big data technologies to drive data-driven decision-making. Proven track record of enhancing data architecture for improved performance and reliability.
🔹 Itinerary Recommendation Engine (Luxury Hospitality Client): Built a personalized multi-label recommendation model using XGBoost. Performed EDA to uncover user behavior patterns and improve model precision. Deployed the solution in Databricks, driving improved engagement and increased revenue.
🔹 Amgen (IDNA Platform Integration):
Migrated legacy Horizon pipelines into Amgen’s IDNA data platform.
Built scalable ETL workflows using Databricks, PySpark, and Delta Lake.
Implemented automation, versioning, and data quality controls to ensure pipeline reliability.
🔹 Dell (Medallion Architecture):
Designed and optimized data marts following the Medallion architecture pattern.
Performed ELT using SQL and scheduled workflows via Databricks and Airflow.
Worked with business teams to translate reporting needs into production-grade pipelines.
🔹 Designed and deployed a Battery Safety Alarm system leveraging real-time IoT data and Exploratory Data Analysis (EDA); built a scoring model that delivered 95% alert accuracy, resulting in monthly cost savings exceeding ₹5 lakhs.
🔹 Developed a Python-based API connector to seamlessly integrate Google Sheets with internal databases, enabling real-time data ingestion; implemented robust ETL pipelines based on business logic for data validation, transformation, and downstream analytics.
🔹 Built interactive, self-serve dashboards for cross-functional business teams, providing real-time operational visibility and empowering data-driven decision-making across the organization.
Data Science using Python - Edureka
Apache Spark and Scala Developer - Edureka
SQL - Coursera/HackerRank