AMOL JADHAO

Summary

As a seasoned Data Engineer with four years of hands-on experience, I have a comprehensive skill set that covers Snowflake, Amazon Web Services (AWS), Apache Airflow, Spark, Python, and SQL.
Strong experience in developing ETL pipelines using AWS services such as S3, Athena, Glue, EMR, and Redshift, along with Snowflake.
Extensive experience in implementing data warehouse solutions in Snowflake for downstream consumption.
Good knowledge of AWS services like IAM, S3, EC2, Glue, Redshift, Athena, EMR, RDS.
Executed PySpark scripts within AWS Glue for diverse ETL tasks.
Designed and implemented ETL pipelines using Apache Airflow for efficient job orchestration.
Experienced in building data pipelines, marts, and warehouses, with good knowledge of OLTP, OLAP, and dimension and fact tables.
Experienced in working with Spark components such as Spark Core, Spark SQL, and the DataFrame API, and with file formats such as CSV, JSON, and Parquet.
Good knowledge of PySpark concepts, including Spark architecture, spark-submit, the Spark SQL engine, RDDs, DataFrames, and advanced joins.
Capable of optimizing Spark job performance using tools such as the Spark Web UI, Spark History Server, and cluster logs.
Worked extensively with SQL to retrieve and test transformed data on Athena, Redshift, and Snowflake. Proficient in SQL concepts including RDBMS, DDL, DML, and DRL statements, joins, views, and aggregate and analytical functions.
Familiar with Python fundamentals, including basic data structures (string, list, set, tuple, dictionary) and anonymous (lambda) functions; used Python to write DAGs in Airflow.
Experienced in Agile methodologies for SDLC, emphasizing organizational deliverables and client management.
Collaborated cross-functionally with business teams and product owners for requirement gathering and analysis.
Participated in daily scrum calls, sprint planning, sprint reviews, sprint retrospectives, and grooming sessions.
Strong technical, communication, analytical, and problem-solving skills.

Overview

4 years of professional experience

Work History

Data Engineer

LEADSFRONT INDIA

05.2020 - Current

Client:

Description:

Roles and Responsibilities:

Automated data transformation and ETL processes using AWS Glue and orchestrated the ETL pipeline using Apache Airflow
Utilized AWS Glue for big data processing and analytics to analyze large datasets
Implemented data lake solutions using AWS S3, providing a centralized and scalable data repository
Developed and optimized data processing workflows using Apache Spark, improving speed and efficiency
Optimized data warehousing solutions in Snowflake, improving query performance and scalability through query profiling, query history analysis, and other optimization techniques
Developed test cases and validated transformed data using Athena
Used Git for version control, enabling efficient code management and collaboration
Engaged in sprint planning, review, retrospectives, grooming sessions, and peer review processes.

Data Engineer

LEADSFRONT INDIA

Client:

Description:

Roles and Responsibilities:

• Managed data from sources like S3 and worked extensively with various file formats for data extraction
and transformation.
• Developed EMR steps to read data from and write data to S3 on a scheduled basis.
• Executed PySpark code for transformations to achieve desired data outcomes and monitored jobs
using CloudWatch.
• Transformed and loaded data from S3 to Redshift with custom transformations as per client requests.
• Designed and optimized databases and data schema in AWS Redshift.
• Orchestrated job execution and data pipeline scheduling using Airflow.
• Engaged in sprint planning, review, retrospectives, grooming sessions, and peer review processes.

Data Engineer

LEADSFRONT INDIA

Client:

Description:

Roles and Responsibilities:

• Employed DML operations to modify existing records, add new data, and remove obsolete entries based
on business requirements.
• Defined data structures, enforced data integrity constraints, and optimized database performance
through appropriate indexing strategies.
• Utilized built-in methods and operators to perform operations on different data types, ensuring efficient
data handling and manipulation.
• Designed and implemented custom functions using the def keyword, encapsulating reusable blocks of
code to promote modularity and code reuse.
• Optimized PySpark code and modularized it into reusable utilities.

Education

Skills

AWS Services: Glue, EMR, Redshift, Athena, S3, EC2, IAM, RDS, SNS, CloudWatch, Lambda
Big Data Technologies: Spark, EMR, Glue, Databricks
Programming languages: Python, SQL, PySpark, SparkSQL
Workflow Management Tool: Airflow
Versioning tool: GitHub

Ticketing Tool: JIRA
Database: Oracle
Data Warehouses: Snowflake, Redshift
Tools: Oracle PL/SQL, Jupyter, PyCharm, VS Code

Personal Information

Title: CLOUD DATA PROFESSIONAL
