
Sudarshan Satpute

Pune

Summary

Immediate joiner. Results-driven Data Engineer with over four years of experience designing and implementing large-scale Big Data processing pipelines, including managing more than 100 terabytes of data for enterprise-grade analytics and reporting. Proficient in Hadoop, PySpark, Hive, Sqoop, Databricks, and SQL, with a track record in data ingestion, transformation, optimization, and performance tuning. Focused on delivering high-quality solutions that improve data accessibility and support informed decision-making. Ready to contribute this expertise to teams tackling data engineering challenges.

Overview

4 years of professional experience
1 Certification

Work History

Data Engineer

Tata Consultancy Services
01.2023 - Current
  • Developed and optimized Spark-based data processing pipelines for validation, transformation, and aggregation in line with client requirements.
  • Collaborated with SMEs, Adobe Architects, and QA teams to ensure smooth delivery of high-quality, scalable data solutions.
  • Built scalable and high-performance data processing solutions, improving data reliability and pipeline efficiency.
  • Implemented partitioning and bucketing strategies in Hive and PySpark, achieving a 30% improvement in data processing time and query performance.
  • Optimized data storage and retrieval by selecting efficient file formats (Avro, Parquet, ORC) and applying compression codecs such as Snappy, LZO, GZIP, and BZIP2, resulting in 15% storage savings and 40% faster data reads.
  • Applied various Spark transformations, actions, and performance tuning techniques such as repartitioning, coalescing, caching, broadcast joins, shuffle sort-merge joins, and cluster-level optimizations to enhance overall data pipeline performance.

Data Operations

Tata Consultancy Services
07.2021 - 12.2022
  • Managed and maintained Hadoop-based data flows as part of the Data Operations team, ensuring data accuracy and consistency across downstream applications.
  • Monitored, managed, and optimized Hadoop data pipelines to ensure smooth data operations for Nordea Bank's critical systems.
  • Streamlined data migration processes during system upgrades, ensuring minimal impact on daily operations.
  • Used Hadoop, Spark, and SQL expertise to identify and resolve operational issues affecting data flows.
  • Analyzed complex code logic and data inconsistencies, identifying root causes and implementing long-term corrective actions.
  • Collaborated with stakeholders and engineering teams to communicate issues, document detailed Root Cause Analysis (RCA), and propose effective resolutions or workarounds.
  • Ensured data quality and consistency across multiple systems by proactively detecting anomalies and escalating upstream data discrepancies to concerned teams.

Education

Bachelor of Engineering - Mechanical Engineering

Sant Gadge Baba University
Amravati, Maharashtra
07.2021

Skills

  • Python
  • PySpark
  • Scala
  • SQL
  • HDFS
  • Apache Spark
  • Apache Hive
  • Sqoop
  • MapReduce
  • HBase
  • YARN
  • Azure Data Lake Storage (ADLS Gen2)
  • Azure Synapse Analytics
  • Azure SQL Database
  • Azure SQL Data Warehouse
  • Azure Cosmos DB
  • Azure Data Lake
  • Azure Data Factory (ADF)
  • Amazon EMR
  • AWS Glue

Certification

  • Microsoft Certified: Azure Fundamentals (AZ-900)
  • Google Associate Cloud Engineer
  • Hadoop and Big Data Developer Certification (Udemy)

Languages

English
Hindi
Marathi

Awards and Recognition

  • 5× On-the-Spot Awards - Received multiple on-the-spot recognitions for outstanding performance, quick problem-solving, and timely delivery of high-quality data solutions.
  • 2× Best Team Awards - Honored as part of the best-performing team for exceptional collaboration, innovation, and successful delivery of large-scale data engineering projects.
