Swapnil Solanke

Aurangabad

Summary

Big Data Developer and Data Engineer with 3 years and 10 months of IT experience across development, implementation, and support of applications. Strong experience with data ingestion using Spark transformations, ETL, Spark SQL, and performance tuning, with an in-depth understanding of Spark architecture, including Spark Core, Spark SQL, and DataFrames. Built data ingestion pipelines sourcing daily data to AWS S3 using PySpark. Experienced in Agile/Scrum methodology, writing Spark applications in PySpark, and data analysis with the Python library pandas. Background in machine learning, statistical analysis, and predictive modeling, with proficiency in Python, R, SQL, and data visualization tools. Skilled at translating complex datasets into actionable insights that inform decision-making and business strategy; past work has improved operational efficiency and supported revenue growth through data-driven strategies.

Overview

  • 4 years of professional experience
  • 1 certification

Work History

Data Engineer

Terminus Info Solutions Pvt. Ltd.
Remote
06.2023 - 02.2025
  • Company overview: Product-based company.
  • Project: Automation and ETL of data.
  • Tools and technologies: AWS Glue, PySpark, S3, SQL, Snowflake, IAM, Miro (flowcharts).
  • Engineered data transformation processes with AWS Glue.
  • Developed Spark jobs for efficient data processing.
  • Developed and executed SQL queries for data extraction.
  • Handled ad hoc assignments promptly.
  • Created flowcharts to enhance high-level design documentation.
  • Improved efficiency by automating the organization's manual processes.
  • Crafted efficient query scripts in Snowflake.

Big Data Developer

Zielotech Software Pvt. Ltd.
Pune
10.2020 - 06.2023
  • Company Overview: E-commerce.
  • Project: Data Lake.
  • Tools & Technologies: Oracle, PySpark, Python, AWS DMS, S3, AWS Glue, Redshift.
  • 2.5 years of IT experience as a Big Data Developer, covering development, implementation, and support of applications.
  • Hands-on data ingestion using Spark transformations, ETL, Spark SQL, and performance tuning.
  • In-depth understanding of Spark architecture, including Spark Core, Spark SQL, and DataFrames.
  • Built the data ingestion pipeline for sourcing daily data to AWS S3 using PySpark.
  • Wrote Spark applications in PySpark and performed data analysis with the Python library pandas.
  • Worked in an Agile/Scrum methodology: attended the daily scrum call to discuss sprint tasks, completed assigned tasks within each sprint's timeline, and joined weekly client calls on requirements and further enhancements.
  • Analyzed PROD tickets to determine whether issues were data-related or code-related.
  • Imported data from the S3 raw bucket, performed cleaning and masking with AWS Glue, and moved the results to the S3 cleanse bucket.
  • Created DDL scripts for tables in Redshift.
  • Supported teammates in importing data from source systems to S3 using AWS DMS.

Education

Bachelor of Engineering

MGM's Jawaharlal Nehru Engineering College
Aurangabad, Maharashtra, IN
06-2020

Skills

  • Data engineering
  • PySpark
  • Spark (Spark Core, Spark SQL, Spark UI)
  • AWS (S3, RDS, IAM, EMR, EC2, Glue)
  • ETL
  • SQL
  • Hadoop (HDFS, MapReduce)
  • Big Data
  • RDBMS
  • Oracle DB
  • Shell scripting

Certifications & Training

  • Hive Hands-on - Great Learning, 10/01/20
  • Data Analysis with PySpark - Great Learning, 10/01/20
  • Ultimate AWS Certified Cloud Practitioner - Udemy, 10/01/20
  • Databricks Accredited Lakehouse Fundamentals - Databricks, 10/01/20


Timeline

Data Engineer

Terminus Info Solutions Pvt. Ltd.
06.2023 - 02.2025

Big Data Developer

Zielotech Software Pvt. Ltd.
10.2020 - 06.2023

Bachelor of Engineering

MGM's Jawaharlal Nehru Engineering College