
Sumit Agrawal

Pune

Summary

  • Overall, 4+ years of IT experience in data engineering, analytics, and software development.
  • Strong experience in data engineering and building batch ETL pipelines using PySpark, Spark SQL, and SQL.
  • Good working exposure to Azure cloud technologies: Blob, ADF, ADLS, and Databricks.
  • Strong knowledge of Python 3 and experience with Java.
  • Strong experience with Python libraries: NumPy and Pandas.
  • Strong experience in data engineering technologies including Hadoop 2, Spark, and YARN.
  • Working experience in Java J2EE, Spring, Hibernate, and RESTful services.
  • Proficient in performing exploratory data analysis (EDA), root cause analysis, and impact analysis on large datasets.
  • Experienced in querying MS SQL Server databases for OLTP and OLAP.
  • Solid understanding of RDBMS concepts, including performance tuning and query optimization.
  • Experience in complete Software Development Life Cycle (SDLC) involving Analysis, Design, Development and Testing.
  • Advanced working SQL knowledge and experience working with relational databases, query optimization (SQL) as well as working familiarity with a variety of databases.
  • Experience building and optimizing big data pipelines, architectures, and data sets for data lakes.
  • Experience performing root cause analysis on internal and external data and processes to answer specific business questions.
  • Strong knowledge of data structures, algorithms, operating systems, and distributed systems fundamentals.
  • Team player with a strong sense of ownership.

Overview

4 years of professional experience

Work History

Software Engineer II

Clairvoyant-An EXL Company
Pune
09.2021 - Current
  • Currently working as a Data Engineer on the development team for a sports-domain client.
  • Developed ETL pipelines in ADF on a data lake using ADLS.
  • Transformed the client's Adobe Analytics data using PySpark on the Databricks platform.
  • Responsible for assessing and improving the quality of customer data.
  • Experience with Azure cloud services: Blob, ADF, and Databricks.
  • Analyzed data quality issues through exploratory data analysis (EDA) using SQL, Python, and Pandas.
  • Reconciled weekly and monthly reports to ensure regulatory compliance.
  • Created automation scripts using various Python libraries to perform accuracy checks from various sources to target databases.
  • Worked with project stakeholders to deliver regulatory reports and recommend remediation strategies that protect the quality of high-priority usage data elements.
  • Environment: Python, Databricks, Spark SQL, PySpark, Pandas, NumPy, Power BI, Azure Blob, ADF, ADLS

Software Engineer

Dynamic Dreamz
Surat
08.2020 - 09.2021
  • Ingested data from various sources and formats, including CSV, flat files, and RDBMS.
  • Worked in an agile team of 4 and contributed to ETL pipeline development for the application using Azure ADF.
  • Filtered junk and bad records using Hive scripts.
  • Closely monitored jobs and fixed failures.
  • Gained expertise in writing and optimizing SQL queries against MS SQL Server.
  • Performed data analysis using Python and handled critical problem solving and troubleshooting.
  • Leveraged various Python modules to improve data validation and testing and to automate daily processes.
  • Environment: Python, Spark, MSSQL, Synapse

Java Developer

Emotech Software Solutions Private Limited
Gwalior
07.2018 - 08.2020
  • Worked with the client Hillsgas on a gas supply management system, developing RESTful services.
  • Extensively involved in requirement analysis, system implementation, and other SDLC phases.
  • Contributed to a team of 6 in an agile environment in developing new interfaces for the Hills gas supply application
  • Contributed to developing RESTful services and backend business logic for the Hillsgas portal, serving requests from several front-end gas cylinder Shipping and Get Kiosk applications, using Spring REST and Spring MVC.
  • Worked with object-relational mapping technologies such as JPA (Hibernate) to develop the data access and repository layers.
  • Implemented backend OLTP functionality by writing complex SQL queries, functions, and stored procedures in PL/SQL.
  • Took part in pair programming, code review, and debugging.
  • Involved in unit test development
  • Involved in UAT and production deployments and support activities
  • Tools & Technologies: Java SE 7, Hibernate, custom framework, JDBC, Eclipse, GitHub

Education

Master of Computer Applications - Computer Science

Madhav Institute of Technology & Science
Gwalior, MP
05.2018

Bachelor of Computer Applications - Computer Science And Programming

MCRPV
Gwalior, MP
07.2014

Skills

  • Programming Languages: Python 3, Java, SQL
  • Databases: MySQL, MSSQL
  • Cloud Technologies: Azure, Blob, ADF, ADLS, Databricks
  • Web Technologies: Java J2EE, PHP
  • Integrated Development Environment (IDE): Eclipse, IntelliJ, Anaconda Jupyter notebooks, VS Code
  • Operating Systems: Linux, Windows
  • Apache Spark, Spark SQL
  • Flexible and Adaptable
  • Teamwork and Collaboration
  • Problem-Solving
