
Sushant Sangale

Bangalore

Summary

  • 5 years 11 months of overall extensive IT experience in end-to-end development of software products, from requirement analysis to system study, designing, coding, testing, debugging, documentation, and implementation.
  • 4 years 10 months of experience in big data technologies using Spark, Hadoop, MapReduce, Sqoop, Scala, HDFS, Hive, HBase, Kafka, YARN, and Cloudera.
  • Good knowledge of AWS Cloud.
  • Working knowledge of Airflow.
  • Working experience with the MapReduce programming model and the Hadoop Distributed File System (HDFS).
  • Expertise in Hadoop V2, Spark ETL, Spark Streaming, Hive, data analytics, Cloudera, HBase storage, and Kafka, including project exposure to ELT and stream analytics with Spark on Cloudera Hadoop.
  • Experienced with processing different file formats, like Avro, XML, JSON, and Sequence file formats, using Spark.
  • Excellent understanding of Hadoop architecture and different components of Hadoop clusters, which include Name Node, Data Node, YARN, Resource Manager, and Node Manager.
  • Worked on importing and exporting data from different databases, like Oracle and MySQL, into HDFS and Hive using Sqoop.
  • Excellent knowledge of Spark SQL, Spark RDD, Data Frame, and Spark Streaming.
  • Experienced in working with MapReduce design patterns to build complex MapReduce programs.
  • Good knowledge of Hive query performance tuning.
  • Good knowledge of NoSQL databases, like HBase.
  • Working knowledge of the Python language.
  • Good knowledge of Linux shell scripting.
  • Good communication skills, a strong work ethic, the ability to work efficiently in a team, and good leadership skills.
  • Able to learn at a rapid pace and work under strict deadlines, with strong problem-solving and decision-making skills.
  • Experience in creating tables, partitioning, bucketing, and loading in Hive.

Overview

6
years of professional experience

Work History

Senior Data Engineer - Client LeafHome

Kloud9 Technologies India Pvt Ltd
Bangalore
02.2025 - Current

L2R at Leaf Home:

Role: Senior Data Engineer.

Project Description: L2R (Lead to Revenue). The current data architecture is a complicated legacy one, which created DAG dependencies on the nightly backup data from Bathplanet SQL servers. As part of the L2R project, the task is to build new ADF pipelines that connect to live SQL servers and ingest the data into Snowflake, enabling business insights on Tableau dashboards.

Responsibilities:

• Development and building of new data pipelines per intake received from the TPM.

• Snowflake migration with fact and dimension tables, leveraging lakehouse features.

• Optimization of stored procedures.

Data Engineer - Client Nike

Kloud9 Technologies India Pvt Ltd
Bangalore
06.2022 - 02.2025

1) MANA at Nike Technologies: 2 years (Dec 2022 - Feb 2025)

Role: Data Engineer

Project Description: MANA (Marketplace Activation for North America). The current data architecture is a complicated legacy one that created cross-region DAG dependencies. As part of the MANA project, the task is to build new pipelines for the NA region and implement them alongside SOLE, including lakehouse implementation and Databricks migration with Delta tables for stage 1 and stage 2, plus new intake requests from stakeholders.

Responsibilities:

• Development and building of new data pipelines per intake received from the TPM.

• Databricks migration with Delta writes, leveraging lakehouse features.

• Optimization of code and DAGs.

2) Airflow Migration at Nike Technologies:

Role: Data Engineer

Project Description: Apache Airflow 2.0 introduced major features, including a refactored, highly available scheduler, over 30 UI/UX improvements, a full REST API, smart sensors, the TaskFlow API, and independent providers. The business requirement was to migrate all data pipelines from Airflow 1.10 to Airflow 2.2.5.

Responsibilities:

• Impact analysis for the Airflow migration.

• New MAP cluster development, per the specified Docker image and image version.

• Initial DAG script development per Airflow 2.2.5 requirements, including end-to-end testing of all operators.

• Coordinated three DE teams for the migration and testing of all DAGs in the preprod and QA environments.

• Deployed all DAGs to the production environment.

Data Engineer

Datamatics Technosoft Limited
Bangalore
03.2021 - 06.2022
  • The purpose of this project was to handle and analyse large volumes of data from e-commerce websites and applications to support business growth. The client maintained the data in an Oracle RDBMS, but as the data grew exponentially, we built a data pipeline to also handle the continually increasing delta data.
  • Imported data into HBase using Spark, from a relational database (Oracle) for historical data and from HDFS for delta data.
  • Wrote Phoenix DDL to create HBase tables.
  • Wrote Hive DDL to create Hive tables for optimized query performance.
  • Loaded CSV files (delimited, fixed-length, etc.) into the Hive warehouse.
  • Involved in importing data from HBase into Hive managed tables using Spark, including incremental loads and some transformations.
  • Involved in importing data from Hive managed tables into Hive external tables, including the required queries, using Spark.
  • Designed both managed and external Hive tables, and defined static and dynamic partitions as per requirements for optimized performance on production datasets.
  • Created PySpark jobs for importing data into HBase.
  • Created PySpark jobs for data transformation and aggregation.
  • Worked with various file formats, like text, ORC, and Parquet, and compression formats like Snappy.
  • Raised Jira tickets for infrastructure and platform issues.
  • Optimized PySpark code for better performance.
  • Environment: Hadoop, Hive, HBase, AWS, PySpark.

Trainee Technical Author

Cades Studec Technologies India
Bangalore
08.2019 - 02.2022
  • Worked on live projects as an illustrator/author.
  • Involved in the creation, revision, and updating of aircraft technical manuals (AMM, CMM, IPC, and Service Bulletins) per aerospace technical publication standards.
  • Analysed and interpreted information from engineering drawings and bills of materials (BOM).

Education

Bachelor of Technology - Aeronautical Engineering

Singhania University
Jhunjhunu, Rajasthan
12.2017

Skills

  • Big Data technologies: Spark Core, Spark SQL, Spark Streaming, Hadoop, HDFS, MapReduce, Sqoop, Hive, HBase, Apache Kafka, Snowflake, YARN, Scala, Airflow, Databricks ecosystem
  • Programming languages: Python
  • Hadoop distribution: Cloudera
  • NoSQL databases: HBase
  • Tools: IntelliJ IDEA, SQL Developer, Putty, PyCharm
  • Cloud: AWS (EC2, S3, RDS, EMR, Glue, MSK, Lambda), Azure (ADF, Blob Storage, Azure DevOps)
  • Build and CI tools: Apache Maven, SBT, Jenkins
  • Project management and version control: Jira, Git, Azure DevOps
  • Databases: Oracle 11g, MySQL, Postgres
  • Operating systems: Windows, Linux

Languages

Marathi
First Language
Hindi
Proficient (C2)
English
Proficient (C2)

Timeline

Senior Data Engineer - Client LeafHome

Kloud9 Technologies India Pvt Ltd
02.2025 - Current

Data Engineer - Client Nike

Kloud9 Technologies India Pvt Ltd
06.2022 - 02.2025

Data Engineer

Datamatics Technosoft Limited
03.2021 - 06.2022

Trainee Technical Author

Cades Studec Technologies India
08.2019 - 02.2022

Bachelor of Technology - Aeronautical Engineering

Singhania University
12.2017