Pawan Kulkarni

Pune, India

Summary


Knowledgeable Data Analyst with a robust background in data architecture and pipeline development. Proven ability to streamline data processes and enhance data integrity through innovative solutions. Advanced proficiency in SQL and Python, applied to support cross-functional teams and drive data-driven decision-making.

Overview

5 years of professional experience
1 Certification

Work History

Data Engineer

AIT Global India
11.2023 - Current
  • Design and Develop Scalable ETL Pipelines: Create efficient PySpark AWS Glue jobs for processing and transforming large datasets into structured formats. Data size: up to 50 GB at once.
  • Data Preprocessing: Clean and preprocess massive raw data into organized, usable structures to support data-driven decision-making.
  • Optimize Glue Jobs: Continuously enhance existing AWS Glue jobs for improved performance, efficiency, and cost-effectiveness.
  • Automation and Deployment: Implement and manage CI/CD pipelines for Glue jobs using Jenkins, ensuring reliable testing, deployment, and monitoring.
  • Business Logic Implementation: Collaborate with stakeholders to translate business requirements into scalable solutions that align with the current application architecture.
  • Production Monitoring: Monitor Glue jobs in production environments to detect and address any data errors or mismatches promptly.

AI Scientist

Pristine.ai
12.2022 - 08.2023
  • Training and hyperparameter tuning of Random Forest, time series and Linear Regression algorithms.
  • Feature engineering: selecting and analysing the features that most affect and improve the accuracy of the model.
  • Data exploration, analysis and pattern finding using R and Microsoft Excel to improve the profit margins for retail companies.
  • Analysing the effects of trends and seasonality on product margins, sales and revenue, ultimately increasing the net revenue by 15%.

Machine Learning Engineer

Cloudmantra
03.2022 - 12.2022
  • Built scalable ETL pipelines on AWS using Glue, Athena, S3, DMS, AWS Lambda and QuickSight.
  • Monitored and maintained data pipelines, performing data validation, data cleaning and pre-processing.

Data Analyst

Ibexlabs
02.2020 - 03.2022
  • Developed Backend Architecture for an IoT-based web application, deployed for Workspace Optimization. Deliverables included APIs (using Flask) for seamless integration and implementation, databases and data pipelines, onboarding modules with various access levels and user profiling, as well as security modules for password and email authentication/verification.
  • Created a system to store, interpret, validate and monitor power plant inverter data against given thresholds and criteria, to rid the concerned plant(s) of any avoidable operational hazards.

Tools Used: AWS S3, DynamoDB; File transfer: SFTP; ETL: AWS Lambda, Athena, Glue

  • Created a web application (CRM) for third-party sales companies for sales enablement in the real estate industry.

Tools: AWS S3, AWS Lambda, MySQL

Data Science Intern

Arque Tech
09.2019 - 12.2019
  • Cleaned and pre-processed stock market data using Pandas and NumPy for backtesting.

Education

Bachelor of Engineering - Mechanical

MIT Academy of Engineering
2018

Skills

  • Python
  • R
  • AWS
  • AWS Glue
  • AWS Redshift
  • SQL
  • PySpark
  • AWS QuickSight
  • AWS Athena
  • AWS Lambda

Certification

  • PGP in Data Science - International School of Engineering
  • Data Engineering with AWS - Udacity
