Vikas Garhwal

Indore

Summary

Experienced data engineering professional with 9+ years of expertise in designing and implementing comprehensive data solutions on cloud and on-premises platforms. Proficient in Azure Data Factory, Databricks, Delta Lake, Spark, Snowflake, and GCP. Skilled in constructing scalable data pipelines utilizing modern architectures such as Medallion (Bronze/Silver/Gold) and integrating with visualization tools like Power BI. Demonstrated success in team leadership, establishing data quality frameworks, and delivering compliance-focused solutions within healthcare and supply chain sectors. Strong emphasis on optimizing performance, ensuring reliability, and automating data platforms.

Overview

10 years of professional experience

2 Certifications

Work History

Module Lead Software Engineer

Impetus Technologies
06.2021 - Current

Project Details:

1) Enterprise Reporting Analytics (ERA) – Nuxeo Recall Notification

Client: McKesson Technologies

Platform & Skills: Azure Data Factory, Logic Apps, Azure Databricks, Delta Live Tables, Delta Tables, GIT CI/CD, Genie AI, Snowflake.

Brief:

Nuxeo Recall Notification covers the ingestion and processing of Recall Notification data from the Nuxeo system into Databricks using a Medallion architecture. The initiative is compliance-driven, supporting McKesson’s government contract to deliver recall reports with invoice and lot numbers. It enables business users to generate actionable insights to ensure safety, regulatory compliance, and timely stakeholder communication.

Key Responsibilities:

  • Designed and developed ADF pipelines to ingest structured recall data from Nuxeo REST API and Snowflake into ADLS.
  • Implemented historical and daily incremental data loads using parameterized ADF pipelines.
  • Built robust, reusable Databricks notebooks for transformation logic across Bronze, Silver, and Gold layers, including schema standardization and validation.
  • Created Gold layer datasets for reporting and operational insights.
  • Enabled Power BI dashboards by publishing Gold layer tables as SQL endpoints via Databricks SQL and integrating them into the Power BI workspace using Azure Active Directory authentication.
  • Integrated Genie AI for exploratory analysis on Gold layer data to perform anomaly detection and generate intelligent recall risk assessments, complementing Power BI with predictive insights and operational foresight.
  • Enabled CI/CD for ADF using Git integration.
  • Authored Technical Design Documents covering ingestion design, transformation logic, and data lineage.


2) Enterprise Reporting Analytics (ERA) - Modernization

Client: McKesson Technologies

Platform & Skills: Azure Data Factory, Logic Apps, Azure Databricks, Delta Tables, GIT CI/CD, Genie AI, Snowflake.

Brief:

Enterprise Reporting Analytics (ERA) is a customer-facing web application platform that handles user authorization and reporting needs. These workloads were running on HDInsight infrastructure and were modernized to the Databricks platform.

Key Responsibilities:

  • Led a team of 3 engineers in the design, development, and implementation of a data analytics pipeline to migrate HDInsight jobs such as ERA, DSCSA, and C360 to Databricks.
  • Modernized the legacy application running on HDInsight infrastructure: proposed the migration strategy and led the team in transitioning existing data processing jobs (ERA, DSCSA, C360) to the Azure Databricks platform, improving performance, scalability, and maintainability.
  • Formulated a data quality framework using Delta tables and expectations.
  • Crafted a versatile notebook for creating Bronze and Silver tables, promoting code reuse.
  • Established metadata-driven pipelines in Azure Data Factory (ADF) for data extraction from Snowflake, Oracle, Blob, and Azure SQL Server.
  • Designed and implemented a visualization dashboard using Databricks SQL for operations and maintenance.
  • Developed Logic Apps for job notifications and failure alerts.
  • Enabled CI/CD for ADF using Git integration.
  • Prepared Technical Design documentation for ADF.


3) Enterprise Reporting Analytics (ERA) - HDI

Client: McKesson Technologies

Platform & Skills: Hive, Sqoop, Oozie, Spark, Scala, Snowflake, HDInsight.

Brief:

Enterprise Reporting Analytics (ERA) is a customer-facing web application platform that handles user authorization and reporting needs. It provides McKesson's customers and internal teams with reports for purchase analysis, savings, client contract compliance, and DSCSA regulatory compliance, as well as product availability, shortage, and trend reports.
The ERA data engineering workload handles onboarding of Connect and SFDC users to the platform, their account permissions, and reports related to ERA and DSCSA, as well as C360 reports.

Key Responsibilities:

  • Developed data pipelines that export data from sources such as MySQL, Oracle, and Snowflake, perform processing and transformation with Hive queries, and store the resulting data in Snowflake and MySQL.
  • Designed end-to-end process flows for ERA, DSCSA, and C360 jobs.
  • Wrote Hive workflows to process large data sets.
  • Wrote Sqoop scripts to import data from SQL Server.
  • Wrote Snowflake tasks to push data into HDFS and vice versa.
  • Optimized existing use cases using Spark.


4) EC & MSD Pricing

Client: McKesson Technologies

Platform & Skills: Hive, Sqoop, Oozie, Spark, Scala, Snowflake, GCP.

Brief: The EC & MSD Pricing project runs on a daily schedule. It captures data for different accounts and items along with their vendor information, forecasts, and pricing history (pricing types, rules, and rebates) from source systems such as Oracle. The data is cleaned and transformed using data engineering tools and Google Cloud technologies, then pushed back to Oracle tables on different servers via SOAP APIs and also loaded into Hive.

Key Responsibilities:

  • Designed the overall solution as an automated pipeline: Cloud Scheduler triggers a Dataproc cluster and workflow to ingest and process the data, and the Dataproc cluster is deleted at the end of the job.
  • Developed the overall solution and resolved issues that arose during the work.
  • Managed the GCP platform for use case deployment.
  • Designed and configured Oozie workflows as requirements changed for EC & MSD Pricing.
  • Ensured data integrity and performed performance tuning for EC & MSD Pricing data.
  • Handled source system issues, working with on-site teams to resolve data problems.

Senior Software Engineer

Impetus Technologies
08.2019 - 06.2021

Project Details:

1) EC & MSD Pricing
Client: McKesson Technologies

Platform & Skills: Hive, Sqoop, Oozie, Spark, Scala, Snowflake, GCP.

Brief: The EC & MSD Pricing project runs on a daily schedule. It captures data for different accounts and items along with their vendor information, forecasts, and pricing history (pricing types, rules, and rebates) from source systems such as Oracle. The data is cleaned and transformed using data engineering tools and Google Cloud technologies, then pushed back to Oracle tables on different servers via SOAP APIs and also loaded into Hive.

Key Responsibilities:

  • Designed and developed an automated GCP pipeline that uses Cloud Scheduler for job scheduling, spins up a Dataproc cluster, runs the workflow, ingests and processes data, loads it to Snowflake, and then deletes the Dataproc cluster.
  • Managed the GCP platform for use case deployment.
  • Designed and configured Oozie workflows as requirements changed for EC & MSD Pricing.
  • Ensured data integrity and performed performance tuning for EC & MSD Pricing data.
  • Handled source system issues, working with on-site teams to resolve data problems.


2) MMS - Varicent and HighSpot
Client: McKesson Technologies

Platform & Skills: Matillion, Snowflake.
Role: Developed, maintained, and supported ingestion of HighSpot and Varicent data from an SFTP server into Snowflake using the Matillion ETL tool, and supported day-to-day activities.
Key Responsibilities:

  • Ingested files in batch and incremental modes.
  • Sourced data through GoAnywhere, the McKesson-approved SFTP tool.
  • Scheduled the pipeline using Control-M.
  • Provided monthly reconciliation reports.

Software Engineer

Impetus Technologies
12.2018 - 08.2019

Project Details:
1) MDNA-McKesson
Client: McKesson Technologies

Platform & Skills: Hive, Sqoop, Oozie, Spark, Scala, Snowflake, GCP.

Role: The MDNA-McKesson project encompasses a variety of use cases, such as PCS Sales Performance, Supplier Scorecard, Cash Receipts, Working Capital, and EC_Pricing, that needed to be migrated from the on-premises Hadoop ecosystem to the Google Cloud Platform. During this exercise, worked on various aspects of Hadoop and cloud technologies, including development, deployment, monitoring, and testing. Also worked on Snowflake use cases for data archival and for Snowflake job monitoring and notification using Python and shell scripting.

Key Responsibilities:

  • Migrated on-premises Hadoop data and deployed use cases on the GCP platform.
  • Developed data ingestion scripts from on-premises systems to Google Cloud Storage.
  • Converted Azkaban workflows to Oozie workflows.
  • Automated use cases for ingestion, Hive tasks, and model execution.
  • Tested the migrated data.
  • Developed Snowflake data warehouse and Python use cases for features such as job notification and data archival using system tables.

Software Engineer

Yash Technologies
09.2016 - 12.2018

Project Details:

1) NextERP
Client: Merck KGaA

Platform & Skills: Hadoop, Sqoop, Hive, Oozie, MapReduce, MySQL
Brief: The NextERP system is business process management software that allows an organization to use a system of integrated applications to manage the business and automate processes related to technology, services, and human resources.
Key Responsibilities:

  • Extracted data from on-premises RDBMS sources into HDFS using Sqoop.
  • Performed transformations, cleaning, and filtering on imported data using Hive, and loaded the final data into HDFS.
  • Handled importing of data from various sources, performed transformations using Hive and MapReduce, loaded data from MySQL into HDFS using Sqoop, and analyzed the data with Hive queries and scripts to study customer behavior.
  • Divided each use case into steps such as data extraction, validation, upload into HDFS, and execution.
  • Uploaded data to Hive and combined new tables with existing databases.
  • Worked with Avro, Parquet, RCFile, and JSON file formats and developed Hive UDFs.
  • Developed Oozie job flows to automate Sqoop and Hive workflows.

2) Wh-Picker App
Client: Merck KGaA

Platform & Skills: Java, Android, SQLite, web services, JIRA, and Git.
Brief:
Warehouse Picker (Wh-Picker) is an Android application designed for pickers to manage warehouse orders. The processes supported in this application were Inbound, Outbound, Staging, and Internal Operations for Merck warehouses.
Key Responsibilities:

  • Developed and implemented requirements from warehouse pickers.
  • Gathered requirements from real users by visiting warehouses and production plants.
  • Was part of the China go-live for the Wh-Picker application and provided support to end users.
  • Debugged and tested in various software development environments.
  • Trained end users and fixed reported application issues on the spot.

Associate Software Engineer

Yash Technologies
11.2015 - 09.2016

Company: Yash Technologies

Role: Developer

Key Responsibilities:

Worked as a Junior Developer and gained hands-on experience across multiple technologies, including Java web application frameworks (Spring, Hibernate) and Android development. Contributed to the development of several Proofs of Concept (PoCs) and consistently delivered reliable code as a dependable team member.

Module Lead Software Engineer

McKesson Technologies
04.2024 - 11.2024
  • Developed data pipelines that export data from sources such as MySQL, Oracle, and Snowflake, perform processing and transformation with Hive queries, and store the resulting data in Snowflake and MySQL.
  • Designed the end-to-end process flow for ERA jobs.
  • Wrote Hive workflows to process large data sets.
  • Wrote Sqoop scripts to import data from SQL Server.
  • Wrote Snowflake tasks to push data into HDFS and vice versa.
  • Optimized existing use cases using Spark.

Module Lead Software Engineer

McKesson Technologies
01.2023 - 03.2024
  • Worked on the EC & MSD Pricing project, which runs on a daily schedule: it captures data for different accounts and items along with vendor information, forecasts, and pricing history (pricing types, rules, and rebates) from source systems such as Oracle; the data is cleaned and transformed using data engineering tools and Google Cloud technologies, then pushed back to Oracle tables on different servers via SOAP APIs and also loaded into Hive.
  • Developed and enhanced the project to meet new customer requirements.
  • Managed the GCP platform for use case deployment.
  • Designed and configured Oozie workflows as requirements changed for EC & MSD Pricing.
  • Ensured data integrity and performed performance tuning for EC & MSD Pricing data.
  • Handled source system issues, working with on-site teams to resolve data problems.

Senior Software Engineer

McKesson Technologies
05.2021 - 12.2022
  • Worked on the MDNA-McKesson project, which encompasses a variety of use cases, such as PCS Sales Performance, Supplier Scorecard, Cash Receipts, Working Capital, and EC_Pricing, that needed to be migrated from the on-premises Hadoop ecosystem to the Google Cloud Platform. Worked on various aspects of Hadoop and cloud technologies, including development, deployment, monitoring, and testing, as well as Snowflake use cases for data archival and Snowflake job monitoring and notification using Python and shell scripting.
  • Migrated on-premises Hadoop data and deployed use cases on the GCP platform.
  • Developed data ingestion scripts from on-premises systems to Google Cloud Storage.
  • Converted Azkaban workflows to Oozie workflows.
  • Automated use cases for ingestion, Hive tasks, and model execution.
  • Tested the migrated data.
  • Developed Snowflake data warehouse and Python use cases for features such as job notification and data archival using system tables.

Developer

Service Source
05.2020 - 10.2020
  • Worked on the Service Source ETL project, which extracts use case data from an FTP server and lands it in AWS S3. Multiple data pipelines are triggered in production and lower environments for various use case entities using the AWS EMR service, which extracts, transforms, and finally loads the data into the use case tables.
  • Developed and modified the application as client requirements changed.
  • Fixed bugs and resolved AWS EMR upgrade/migration issues in lower environments.
  • Converted existing Oozie workflow scripts to Apache Airflow for the Partition Compaction module.
  • Provided production support for existing use cases and verified them.

Developer

McKesson
01.2019 - 04.2020
  • Migrated on-premises Hadoop data and deployed use cases on the GCP platform.
  • Developed data ingestion scripts from on-premises systems to Google Cloud Storage.
  • Converted Azkaban workflows to Oozie workflows.
  • Automated use cases for ingestion, Hive tasks, and model execution.
  • Tested the migrated data.
  • Developed Snowflake data warehouse and Python use cases for features such as job notification and data archival using system tables.

Module Lead Software Engineer

McKesson Technologies
04.2025 - Current
  • Led the design, development, and implementation of a data analytics pipeline to process Recall Notification data from Nuxeo REST APIs into Databricks, following the Medallion architecture (Bronze/Silver/Gold). This initiative supports McKesson’s compliance obligations related to product recalls and enables the business to monitor trends, issue reports, and proactively engage customers.
  • Designed and developed ADF pipelines to ingest structured recall data from Nuxeo REST API and Snowflake into ADLS.
  • Implemented historical and daily incremental data loads using parameterized ADF pipelines.
  • Built robust, reusable Databricks notebooks for transformation logic across Bronze, Silver, and Gold layers, including schema standardization and validation.
  • Created Gold layer datasets for reporting and operational insights.
  • Enabled Power BI dashboards by publishing Gold layer tables as SQL endpoints via Databricks SQL and integrating them into the Power BI workspace using Azure Active Directory authentication.
  • Integrated Genie AI for exploratory analysis on Gold layer data to perform anomaly detection and generate intelligent recall risk assessments, complementing Power BI with predictive insights and operational foresight.
  • Authored Technical Design Documents covering ingestion design, transformation logic, and data lineage.

Software Engineer

Yash Technologies
11.2015 - 12.2018
  • Worked on Warehouse Picker (Wh-Picker), an Android application designed for pickers to manage warehouse orders. The processes supported in this application were Inbound, Outbound, Staging, and Internal Operations for Merck warehouses.
  • Developed and implemented requirements from warehouse pickers.
  • Gathered requirements from real users by visiting warehouses and production plants.
  • Was part of the China go-live for the Wh-Picker application and provided support to end users.
  • Debugged and tested in various software development environments.
  • Trained end users and fixed reported application issues on the spot.

Developer

Yash Technologies
09.2017 - 10.2018
  • Worked on the NextERP system, business process management software that allows an organization to use a system of integrated applications to manage the business and automate processes related to technology, services, and human resources.
  • Extracted data from on-premises RDBMS sources into HDFS using Sqoop.
  • Performed transformations, cleaning, and filtering on imported data using Hive, and loaded the final data into HDFS.
  • Handled importing of data from various sources, performed transformations using Hive and MapReduce, loaded data from MySQL into HDFS using Sqoop, and analyzed the data with Hive queries and scripts to study customer behavior.
  • Divided each use case into steps such as data extraction, validation, upload into HDFS, and execution.
  • Uploaded data to Hive and combined new tables with existing databases.
  • Worked with Avro, Parquet, RCFile, and JSON file formats and developed Hive UDFs.
  • Developed Oozie job flows to automate Sqoop and Hive workflows.

Education

Bachelor of Engineering - Information Technology

Medicaps Institutions
Indore, India
06-2015

Senior Secondary - Maths And Science

Star School
Indore
06-2011

Skills

  • Databricks
  • Delta Live Tables (DLT)
  • Azure Cloud Platform
  • Spark

  • Python
  • GCP
  • Hadoop
  • SQL

Certification

  • Databricks Certified Professional Data Engineer
  • Databricks Certified Associate Data Engineer


Timeline

Module Lead Software Engineer

McKesson Technologies
04.2025 - Current

Module Lead Software Engineer

McKesson Technologies
04.2024 - 11.2024

Module Lead Software Engineer

McKesson Technologies
01.2023 - 03.2024

Module Lead Software Engineer

Impetus Technologies
06.2021 - Current

Senior Software Engineer

McKesson Technologies
05.2021 - 12.2022

Developer

Service Source
05.2020 - 10.2020

Senior Software Engineer

Impetus Technologies
08.2019 - 06.2021

Developer

McKesson
01.2019 - 04.2020

Software Engineer

Impetus Technologies
12.2018 - 08.2019

Developer

Yash Technologies
09.2017 - 10.2018

Software Engineer

Yash Technologies
09.2016 - 12.2018

Associate Software Engineer

Yash Technologies
11.2015 - 09.2016

Software Engineer

Yash Technologies
11.2015 - 12.2018

Bachelor of Engineering - Information Technology

Medicaps Institutions

Senior Secondary - Maths And Science

Star School