Data Engineer with 3.5+ years of experience in SQL, PySpark, Databricks, and Microsoft Azure. Skilled in data analysis, pipeline development, IDF configuration, and building scalable data solutions using Azure Data Factory, Synapse, Auto Loader, Delta Live Tables (DLT), and Microsoft Purview. Highly adaptable, motivated self-learner who works well both independently and collaboratively. Seeking a challenging full-time role in a reputable organization to contribute technical expertise, learn from experienced professionals, and drive impactful data solutions.
Developed a reusable, generic worker pipeline in Azure Synapse Analytics to ingest .xlsx and .csv files through the Metadata-Driven Copy Data Tool framework, integrated with secure HTTP linked services for authentication.
Developed metadata for external static file loads (QAD, QMIS, product-classification, Xref) using metadata frameworks.
Designed and developed a metadata-driven copy framework for the Workday implementation that ingests .xlsx files from SharePoint into ADLS Gen2.
Streamlined column naming conventions across external sources by embedding transformations into metadata, preventing pipeline failures caused by inconsistent column names in source files.
Implemented a pipeline feature in the gold layer that refreshes the corresponding Power BI dataflow and dataset.
Conducted a POC for implementing row-level and object-level security in the gold layer (for dimensions and facts), ensuring data access control in compliance with governance policies.
Conducted analysis and testing of RLS/OLS for the silver and gold layers.
Data catalogue and Purview implementation: learned Purview from scratch and automated data governance tasks, which resulted in deploying and tagging all Smart ERP data assets in production in less than two days.
Worked within a medallion architecture and built scalable pipelines for data ingestion from JD Edwards.
Optimized the performance of long-running data transformations and queries, leveraging Databricks Auto Loader and Delta Live Tables (DLT) to ensure data quality and monitoring with minimal manual intervention.
Implemented metadata-driven frameworks to enable automated and reusable data ingestion processes, particularly from JD Edwards and other ERP systems.
Participated in environment setup and end-to-end pipeline deployment.
Set up orchestration for the Stryker China environment in Azure Synapse.
Built soft transformations and converted JDE data types to Databricks data types on JDE data for the Pacifica project.
Built an automated data pipeline to load pricing history data for the past 60 months in iterations.
Built generic Synapse pipelines to move data from JDE to ADLS in the China Azure environment.
Developed data mappings in SQL, performed end-to-end testing, and developed raw-to-curated and curated-to-enriched configurations for data ingestion and transformation.
Enhanced data pipeline processes by evaluating the workflow and increasing its efficiency.
Developed complex queries involving analytic functions and stored procedures to perform data aggregations and ensure data readiness.
Developed a new architecture from scratch and supported refactoring of the existing design implementation, data flow, and analysis.
Collaborated with the data modeler and product owner to understand requirements and to develop and deploy solutions aligned with business needs.
Optimized the performance of long-running queries and stored procedures.
Improved processes by automating time-consuming manual tasks, such as periodically dropping backup tables.
Worked in Ensure, developing technical validation rules to perform data quality checks.
Performed data validation and analysis.
Actively participated in requirement gathering, build, test, and deployment handover activities.
Education
M.Tech - Computer Science and Engineering
University of Calcutta
2020
B.Tech - Computer Science and Engineering
University of Calcutta
2018
Skills
SQL, PySpark, Databricks, Azure Data Factory, Azure Synapse Analytics, Microsoft Purview, Delta Live Tables, Metadata-Driven Framework, GCP
Awards
Received recognition for client value creation.