Mangesh Laxmikant Limbekar

Parbhani

Summary

AWS Data Engineer with a proven track record of developing scalable data pipelines at Clairvoyant India Pvt. Ltd. Expertise in Terraform and Jenkins CI/CD has driven optimized data storage and retrieval, delivering measurable performance improvements and cost savings. Strong analytical skills and close collaboration with cross-functional teams drive successful project outcomes.

Overview

6 years of professional experience

Work History

AWS Data Engineer

Clairvoyant India Pvt. Ltd
Pune
11.2021 - Current

# Project 01 - Shutterfly ServiceLog Flattening Pipeline

  • Built a data pipeline to migrate and transform service logs from the Kinesis S3 output folder to a final output S3 bucket.
  • Processed and stored the logs in Parquet format, partitioned by application name.
  • Deployed and managed the pipeline across Dev, Preprod, and Prod environments using Jenkins CI/CD and Terraform.
  • Utilized AWS Glue scripts for data migration and transformation.
  • Ensured efficient processing and validation using EMR Spark jobs.
  • Optimized data storage and retrieval through S3 lifecycle rules and other cost-saving strategies.
  • Queried and analyzed processed data using AWS Athena for reporting and validation.
  • Collaborated with DevOps and Data Engineering teams to enhance pipeline efficiency.
  • Technologies Used: AWS Cloud Services (S3, EMR, Glue, Step Functions, Lambda, Athena), Terraform, Jenkins, PySpark.
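The per-application routing described above can be sketched in plain Python. The key layout (`.../<app_name>/year/month/day/file.gz`) and the output prefix are illustrative assumptions, not the actual Shutterfly conventions:

```python
# Hypothetical sketch: route a Kinesis Firehose service-log object key to a
# Parquet output prefix partitioned by application name. The input key layout
# (.../<app_name>/year/month/day/file.gz) is an assumption for illustration.

def output_parquet_key(input_key: str, output_prefix: str = "flattened") -> str:
    """Derive the partitioned Parquet destination for one service-log object."""
    parts = input_key.strip("/").split("/")
    if len(parts) < 5:
        raise ValueError(f"unexpected key layout: {input_key!r}")
    app_name, year, month, day = parts[-5:-1]
    file_stem = parts[-1].rsplit(".", 1)[0]
    # Hive-style partition names let Athena prune by app_name and date.
    return (f"{output_prefix}/app_name={app_name}/"
            f"year={year}/month={month}/day={day}/{file_stem}.parquet")
```

Writing the data under Hive-style `key=value` prefixes is what makes the downstream Athena queries cheap: the engine can skip whole partitions instead of scanning every object.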

AWS Data Engineer

Clairvoyant India Pvt. Ltd
Pune
11.2021 - 06.2023

# Project 02 - Snapfish Image Migration Pipeline

  • Designed and built end-to-end data pipelines for Snapfish image migration using Terraform and Jenkins across Dev, Preprod, and Prod environments.
  • Processed raw image data stored in an S3 bucket, applied transformations, and stored the output in Parquet format.
  • Automated deployment and infrastructure provisioning using Jenkins CI/CD pipelines and Terraform.
  • Developed PySpark code for processing and migrating image data.
  • Designed and implemented Glue scripts for data transformation.
  • Performed post-migration validation using EMR Spark jobs.
  • Processed JSON input data into Parquet format for optimized storage and querying.
  • Analyzed output data stored in the S3 bucket using Athena queries.
  • Applied cost-optimization strategies to S3 buckets by implementing lifecycle rules based on client requirements.
  • Leveraged AWS services such as Glue, S3, EMR, and Step Functions for efficient data processing.
  • Monitored and optimized the data-processing pipeline.
  • Collaborated with cross-functional teams to ensure compliance with data security and performance standards.
  • Evaluated and recommended cloud service providers based on project requirements and budget.
  • Technologies Used: AWS Cloud Services (Glue, S3, EMR, Step Functions, Athena, SNS), Terraform, Jenkins, PySpark.
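A minimal sketch of the lifecycle-rule cost optimization mentioned above, expressed as the configuration dict that boto3's `put_bucket_lifecycle_configuration` expects. The prefix and day thresholds are illustrative placeholders, not the client's actual requirements:

```python
# Hypothetical sketch of an S3 lifecycle rule: transition older objects to
# Glacier, then expire them. Prefix and thresholds are placeholder values.

def lifecycle_config(prefix: str, glacier_after_days: int,
                     expire_after_days: int) -> dict:
    """Build a lifecycle configuration for one bucket prefix."""
    return {
        "Rules": [
            {
                "ID": f"cost-opt-{prefix.strip('/')}",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"}
                ],
                "Expiration": {"Days": expire_after_days},
            }
        ]
    }

# Applying it would look like this (not executed here; bucket name is made up):
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="snapfish-output",
#     LifecycleConfiguration=lifecycle_config("images/", 90, 365))
```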

Data Engineer

Advent Informatics Pvt. Ltd.
Pune
11.2018 - 10.2021

# Project 03 - Data Migration to Datalake

  • Imported data from various relational databases (Oracle, MySQL Server) into HDFS and Hive using Sqoop.
  • Wrote Hive DDL to create Hive tables optimized for query performance.
  • Ingested flat files (delimited, fixed-length, etc.) into the Hive warehouse.
  • Monitored jobs daily, fixed production bugs, and raised tickets for production activity.
  • Engaged with BAs to understand requirements clearly.
  • Defined all possible test cases, along with the test data.
  • Designed both managed and external Hive tables, and defined static and dynamic partitions as required for optimized performance on production datasets.
  • Applied optimization techniques to enhance Sqoop job performance.
  • Worked with various file formats (text, sequence, ORC, Parquet) and compression formats (Snappy, bzip2).
  • Wrote Hive queries for data analysis to meet business requirements.
  • Implemented data validation.
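The Sqoop-based ingestion above can be sketched as a small Python helper that assembles the `sqoop import` command line. The JDBC URL, table names, and mapper count are placeholders, not the project's real values:

```python
# Hypothetical sketch: build the argv for a Sqoop import that lands an RDBMS
# table in the Hive warehouse. All connection details are placeholders.

def sqoop_import_cmd(jdbc_url: str, table: str, hive_table: str,
                     num_mappers: int = 4) -> list[str]:
    """Assemble a `sqoop import` command with Hive import enabled."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--table", table,
        "--hive-import",                    # load the data into Hive
        "--hive-table", hive_table,
        "--num-mappers", str(num_mappers),  # parallelism of the import
        "--compress",                       # enable output compression
        "--compression-codec", "org.apache.hadoop.io.compress.SnappyCodec",
    ]
```

Tuning `--num-mappers` (together with an evenly distributed `--split-by` column) is the usual first lever when optimizing Sqoop job performance, since it controls how many parallel connections hit the source database.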

Education

Bachelor's Degree - Mechanical Engineering

Shree Ramchandra College of Engineering
Pune
07.2018

Skills

  • Spark Core
  • Spark SQL
  • PySpark
  • Cloudera
  • Terraform
  • Jenkins CI/CD
  • PyCharm
  • AWS
  • MySQL
  • Linux
  • AWS cloud services
  • CI/CD automation
  • AWS Glue
  • Cost optimization
  • Python
  • HDFS
  • Sqoop
  • Hive
  • Oracle
  • Jira

Technology Proficiency

AWS (Glue, Step Functions, S3, EC2, RDS, IAM, EMR Cluster), Python, Shell Script, HDFS, Sqoop, Hive, Spark Core, Spark SQL, Cloudera, Terraform, Jenkins CI/CD, PyCharm, MySQL, Oracle, Jira

Accomplishments

  • Successfully led and completed the Snapfish migration project single-handedly, building an end-to-end data pipeline using Jenkins, Terraform, AWS Glue, and PySpark.

Personal Information

Date of Birth: 04/04/93

Languages

  • English
  • Hindi
  • Marathi

Timeline

AWS Data Engineer

Clairvoyant India Pvt. Ltd
11.2021 - Current

AWS Data Engineer

Clairvoyant India Pvt. Ltd
11.2021 - 06.2023

Data Engineer

Advent Informatics Pvt. Ltd.
11.2018 - 10.2021

Bachelor's Degree - Mechanical Engineering

Shree Ramchandra College of Engineering