Rugved Vengurlekar

Summary

Highly skilled AWS Data Engineer with 6+ years of experience designing and implementing data warehouse and data lake solutions, cloud migrations, data governance, and data modeling for clients across the banking, insurance, public sector, pharma, and telecom domains. Proficient in a range of cloud technologies, including AWS Glue, AWS Step Functions, AWS CloudWatch, AWS S3, AWS Redshift, Snowflake, AWS QuickSight, PySpark, Spark SQL, and Python. ETL tools: AWS Glue. Databases/data warehouses: Redshift, Oracle, SQL Server, MySQL, RDS.

Overview

6 years of professional experience
1 Certification

Work History

Data Engineer 2

JP Morgan Chase
05.2024 - Current
  • Designed and architected a data pipeline to read data from Amazon DynamoDB (AWS NoSQL database) using AWS Glue jobs and Lambda. Automated the pipeline with Amazon EventBridge, generating daily events that invoke a Lambda function to trigger the Glue job.
  • Wrote processed Parquet files to an S3 bucket after transforming the data with AWS Glue. Created an SNS topic and Lambda function to trigger an ECS task that loads the data lake for analytics.
  • Used EaC (Everything as Code) with Terraform to provision infrastructure.
  • Built a DevOps pipeline using Jenkins and Jules to integrate CI/CD.
  • Used Control-M to re-trigger the Lambda function on failure.
  • Used Git for version control and Jira for task tracking.
  • Integrated Dynatrace and Splunk dashboards to monitor alerts and issues in production.
  • Used PySpark, Python, and Spark SQL for script conversion.
  • Helped the Wealth Engagement business gain insights into enrolled customers.
  • Connected with multiple stakeholders to onboard the new architecture to production.
  • Worked with the product team to build a data pipeline providing insights for the Personalizations customer business.
  • Used Boto3 to connect to DynamoDB, S3, and other AWS services.
  • Benchmarked cost and runtime for Glue, DynamoDB RCUs, Lambda, and S3 across different Glue worker and DynamoDB RCU configurations, cutting AWS service costs by 70% and reducing data pipeline runtime.
  • Created a global secondary index on DynamoDB to read 70 million records using batch_get_item and Query operations. Evaluated multiple approaches to scanning data, including parallel multi-segment scans (see the sketch after this list).
  • Guided new team members on deployment and infrastructure setup:
    Led production deployment processes using Jules, Jet Evidence, CI/CD procedures, and SNOW ticket creation, ensuring compliance and successful implementation.
    Instructed on Control-M setup, including job creation, scheduling, and monitoring, to streamline workflow management.
    Facilitated AWS Lambda deployments and infrastructure creation using EaC, enhancing automation with Jenkinsfile and Jules configuration.
    Guided the team in resolving issues related to Control-M jobs and EMR, including debugging and executing Jules deployments.
    Facilitated the local setup of big data tools such as Spark and Hadoop, ensuring an efficient working environment.
  • Conducted milestone reviews and enhanced the generic data pipeline framework created by consultants, performing comprehensive code and implementation reviews focused on the ingestion and testing frameworks. Identified and suggested fixes for issues, improving the Extract and Transform phases of the pipeline.
    Proposed and implemented improvements, including date-based folder organization, documentation on Confluence, and parameterization of queries.
  • Published and validated data from CCBDatalake to Snowflake:
    Successfully published data from CCBDatalake to the Snowflake production environment, ensuring it is ready for end-user access.
  • Created data features to identify each customer's most recent large deposit using EMR and Spark SQL, helping Wealth Engagement encourage customers to invest with JP Morgan based on large deposits received.
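
A minimal sketch of the parallel multi-segment DynamoDB scan referenced above, using Boto3's DynamoDB resource API; the table name, segment count, and worker count are hypothetical, not the production values.

    import boto3
    from concurrent.futures import ThreadPoolExecutor

    dynamodb = boto3.resource("dynamodb")
    table = dynamodb.Table("customer_enrollments")  # hypothetical table name

    def scan_segment(segment, total_segments):
        # Scan one logical segment of the table, following pagination.
        items = []
        kwargs = {"Segment": segment, "TotalSegments": total_segments}
        while True:
            page = table.scan(**kwargs)
            items.extend(page["Items"])
            if "LastEvaluatedKey" not in page:
                return items
            kwargs["ExclusiveStartKey"] = page["LastEvaluatedKey"]

    # Fan the scan out across 8 segments to cut wall-clock read time.
    with ThreadPoolExecutor(max_workers=8) as pool:
        results = pool.map(lambda seg: scan_segment(seg, 8), range(8))
    records = [item for segment_items in results for item in segment_items]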

Consultant - Senior Data Engineer

Deloitte USI
05.2022 - 04.2024
  • Designed and developed scalable data pipelines and ETL workflows in the pharmaceutical domain using AWS services, ensuring efficient, secure, and compliant data processing aligned with industry regulations.
  • Led the design, development, and deployment of core platform components for a pharma analytics platform to support clinical trial data analysis and regulatory reporting.
  • Built and implemented an end-to-end ETL pipeline using AWS Glue, Glue Crawlers, Lake Formation, Data Catalog, S3, and Athena to ingest, catalog, and query structured and semi-structured datasets including patient records and drug efficacy data.
  • Generated client reports using Power BI for data analytics.
  • Created a data model for a multi-model dataset by organizing and structuring data from different sources and formats into a unified model for analysis and reporting.
  • Worked on and guided resources through a data migration project moving data from Netezza to Redshift.
  • Worked with AWS Glue, S3, Redshift, and Airflow, and wrote ETL PySpark scripts to transform and migrate data to Redshift (see the sketch after this list).
  • Designed, implemented, and automated an audit mechanism for data loads, reducing data validation time and effort by 95%.
  • Optimized Spark memory usage, reducing Glue job runtime and improving job efficiency, thereby saving operational cost.
  • Collaborated with cross-functional teams, including developers, testers, and stakeholders, to gather requirements, design solutions, and ensure timely project delivery.
  • Used Bitbucket and GitHub for version control.
  • Performed code reviews, providing technical guidance and feedback to team members, and mentored junior team members when needed.
  • Used Jira for issue and project task tracking.
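
A minimal sketch of the kind of Glue PySpark job used in the Netezza-to-Redshift migration described above; the S3 paths, Glue connection name, and table/column names are hypothetical.

    import sys
    from awsglue.context import GlueContext
    from awsglue.dynamicframe import DynamicFrame
    from awsglue.job import Job
    from awsglue.utils import getResolvedOptions
    from pyspark.context import SparkContext

    args = getResolvedOptions(sys.argv, ["JOB_NAME"])
    glue_context = GlueContext(SparkContext.getOrCreate())
    job = Job(glue_context)
    job.init(args["JOB_NAME"], args)

    # Read the staged Netezza extract from S3 (hypothetical path).
    src = glue_context.create_dynamic_frame.from_options(
        connection_type="s3",
        connection_options={"paths": ["s3://example-bucket/netezza-extract/claims/"]},
        format="parquet",
    )

    # Example transformation: drop rows missing the key and rename a legacy column.
    cleaned = (
        src.toDF()
        .dropna(subset=["claim_id"])
        .withColumnRenamed("clm_amt", "claim_amount")
    )

    # Load into Redshift through a preconfigured Glue connection (hypothetical names).
    glue_context.write_dynamic_frame.from_jdbc_conf(
        frame=DynamicFrame.fromDF(cleaned, glue_context, "cleaned"),
        catalog_connection="redshift-conn",
        connection_options={"dbtable": "analytics.claims", "database": "dev"},
        redshift_tmp_dir="s3://example-bucket/tmp/",
    )
    job.commit()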

AWS - Data Engineer

Principal Financial Group
09.2021 - 05.2022
  • Worked with AWS Glue, Athena, S3, CloudWatch, Step Functions, CloudFormation, and Azure DevOps.
  • Built a data pipeline to generate client reports in the insurance domain.
  • Integrated data from multiple sources and transformed it into the required format to generate client reports with Tableau.
  • Worked with onshore counterparts and clients to gather requirements, then implemented the data transformations needed to prepare data for report generation.
  • Built a data pipeline using Step Functions, S3, Glue, Lambda, CDK, CloudFormation, CloudWatch, and Secrets Manager to generate client reports.
  • Resolved Spark DataFrame memory issues in Glue caused by very large data volumes (see the sketch after this list).
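
A minimal sketch of one common fix for such Glue memory issues: repartitioning a large DataFrame so each Spark task processes a smaller slice; the S3 paths, partition count, and column names are hypothetical.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # Hypothetical large input that previously overwhelmed executor memory.
    df = spark.read.parquet("s3://example-bucket/policy-events/")

    # Raise the partition count so each task holds a smaller slice of data,
    # and partition on the join key to spread skewed keys across executors.
    df = df.repartition(400, "policy_id")

    # Write partitioned output so downstream reads stay memory-friendly too.
    df.write.mode("overwrite").partitionBy("event_date").parquet(
        "s3://example-bucket/policy-events-curated/"
    )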

Application Development Analyst

Accenture
08.2019 - 09.2021
  • Leveraged popular Python libraries, including Pandas, to manipulate, analyze, and perform scientific computing on large datasets.
  • Integrated with cloud services and APIs, such as AWS Boto3, to interact with cloud resources programmatically, enabling efficient and automated workflows within cloud-based environments.
  • Worked with MySQL, PostgreSQL, and NoSQL databases.
  • Wrote PySpark scripts to perform data transformations for a pharma client.
  • Developed mappings in Informatica PowerCenter to load data into AWS Redshift.
  • Designed end-to-end ETL pipelines using AWS Glue, Step Functions, S3, DynamoDB, RDS, and Lambda.
  • Contributed fully to design assessments and code reviews.
  • Wrote Unix shell scripts to migrate all on-premises archive data files to AWS Glacier (see the sketch after this list).
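
A minimal Boto3 sketch equivalent to the shell-based archive migration above, uploading files directly to the S3 Glacier storage class; the directory, bucket, and key prefix are hypothetical.

    import boto3
    from pathlib import Path

    s3 = boto3.client("s3")

    # Hypothetical local archive directory and target bucket.
    archive_dir = Path("/data/archive")
    bucket = "example-archive-bucket"

    for path in archive_dir.glob("*.dat"):
        # StorageClass=GLACIER writes the object straight to S3 Glacier.
        s3.upload_file(
            str(path),
            bucket,
            f"archive/{path.name}",
            ExtraArgs={"StorageClass": "GLACIER"},
        )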

Education

Master of Technology - Software Engineering

Birla Institute of Technology and Science
Pilani
06.2023

Bachelor of Engineering - Computer Engineering

Sinhgad College of Engineering
01.2019

Skills

  • Data pipeline design
  • AWS Glue
  • AWS Lambda
  • NoSQL - DynamoDB
  • MySQL
  • Postgres
  • ETL workflows
  • Data transformation
  • Cloud infrastructure
  • Performance optimization
  • SparkSQL
  • Infrastructure as code
  • Data Lake
  • Data Warehousing
  • Snowflake
  • Data migration

Certification

  • AWS Certified Developer - Associate, Amazon Web Services (AWS)
  • Microsoft Certified: Azure Data Fundamentals, Microsoft
  • Microsoft Certified: Azure AI Fundamentals, Microsoft
  • Microsoft Certified: Azure Fundamentals, Microsoft

Honors and Awards

  • Applause Award, Deloitte, 08/01/23
  • Zenith Award, Accenture
