Shubham Thakur

Big Data Engineer
Pune, MH

Summary

Highly skilled Big Data Engineer with 4.9 years of experience in designing, implementing, and managing large-scale data processing systems. Strengths include extensive knowledge of big data tools and technologies, strong analytical skills, and the ability to troubleshoot complex problems effectively. Proven ability to deliver solutions that improved system performance and efficiency in previous roles.

Work History

Big Data Engineer

Cognizant Technology Solutions Corporation
Pune
12.2022 - 08.2024
  • Optimized data pipelines using techniques such as Delta cache, Adaptive Query Execution, partitioning, and bucketing.
  • Collaborated with stakeholders to gather and understand requirements, and worked in an Agile methodology to develop a data engineering model.
  • Designed an end-to-end pipeline using Azure Data Factory and Azure Databricks for a lending project.
  • Created data quality checks and automatic notifications, triggered on detection of malicious data, using Azure Data Factory and Azure Databricks.
  • Strengthened POCs for setting up environments using Azure Code Deploy on Databricks and ADLS Gen2.
  • Conducted performance tuning on PySpark code, reducing execution time by 20%.
  • Achieved proficiency in PySpark, Azure Data Factory, Databricks, ADLS Gen2, Delta Lake, and other Azure services.

Azure Data Engineer

Cognizant Technology Solutions Corporation
Pune

View 27.

  • This project aims to create 27 views by 2027. Developed the Loans View, Customer Accounts View, and Customer Marketing Solicitation View end to end.
  • Initiated automated data ingestion from DB2 to ADLS Gen2, processed the data using Azure Databricks, and stored the results in ADLS Gen2 and Azure SQL Database as required.

Customer Segments Analysis.

  • This project was intended to identify the premier customers across the Savings, Current, Loan, and OD account segments.
  • Implemented an end-to-end automated data pipeline, from ingestion to processing, using PySpark and Azure Cloud.

AWS Data Engineer

Anarock Property Consultants
Pune
07.2019 - 06.2022
  • Configured the Data Transformation (DT) tool: created connection objects (connecting to the source and target) and datasets (where the data actually resides), and finally worked on the mappings in the DT tool.
  • Prepared mappings in the DT tool that connected multiple sources and applied transformation logic, such as joins and expressions (string functions, numeric functions, and constants). Also worked with other logic available in the DT tool.
  • Created the storage system for loading data from SQL to S3 and prepared the source feeds for loading the data. Worked on the source and feed system for creating, viewing, and editing source systems and source feeds in the Metadata Repository.
  • Created the Step Function to run full extraction, deduplication, and upsert triggers for source tables, and prepared the respective jobs in AWS Glue.

Education

MBA - Banking And Finance

Narsee Monjee Institute of Management Studies
Mumbai
07.2020 - 07.2022

BBA - Marketing

Marathwada Mitramandal College of Commerce
06.2018

Skills

  • Apache Spark
  • Databricks
  • Azure Data Factory
  • Azure
  • AWS
  • PySpark
  • Python
  • SQL
  • Hadoop

Languages

Hindi
First Language
English
Upper Intermediate (B2)
Marathi
Proficient (C2)

Work Preference

Work Type

Full Time, Part Time

Work Location

On-Site, Remote, Hybrid