Mayur Gupta

Bangalore

Summary

Data professional with 3+ years of experience, skilled in Python, PySpark, and ETL processes. Proficient in cloud services (AWS, Azure) and data visualization (Power BI). Adept at data analysis, engineering, and science, with a strong track record of leveraging technology to deliver actionable insights and streamline data operations. Seeking to contribute expertise in data-driven roles.

Overview

4 years of professional experience
1 Certification

Work History

Associate Consultant

KPMG Global Services
Bangalore
04.2022 - Current

ETL and Data Quality for US-Based Asset Management Firm

  • Objective: Enhance data quality management and implement ETL processes.
  • Tools: Informatica Data Quality, PyDeequ, Python, AWS Glue, PySpark, Alteryx.
  • Key Achievements:
    Implemented 100+ data quality rules using Informatica Data Quality and PyDeequ, resulting in a 30% reduction in data inconsistencies.
    Conducted in-depth data profiling using IDQ and Python pandas profiling, leading to the identification of critical data quality issues and the creation of business-approved DQ rules.
    Developed ETL mappings in Informatica to filter and log outlier data, enhancing data accuracy and enabling the creation of comprehensive DQ dashboards.
    Automated data comparison and reconciliation processes with Alteryx, increasing data validation efficiency by 40%.

Automated Data Quality Framework Implementation Using AWS for US Client

  • Objective: Develop a Proof of Concept (POC) for a US-based client to automate data quality checks using AWS services, improving data integrity and compliance.
  • Tools: PyDeequ, AWS Lambda, AWS Glue, S3, PySpark, Python.
  • Key Achievements:
    Designed and implemented a user interface (UI) allowing users to create data quality rules, which are stored in a backend table named data_quality_rules.
    Data Ingestion & Processing: Automated the process of uploading datasets to S3, triggering an AWS Lambda function.
    Integrated an AWS Glue pipeline with multiple jobs:
      Job 1: Detected dataset file types (e.g., Parquet, CSV, XLSX) and dynamically created PySpark DataFrames.
      Job 2: Queried the data_quality_rules table and applied PyDeequ constraint verification to validate the rules against the dataset. Stored results indicating the success or failure of each rule.
      Job 3: Queried verification results and automatically notified data stewards of any failed rules for further investigation.
    Enhanced data quality management with automated checks, reducing manual intervention by 80%.
    Improved response time for data quality issues by 60%, leading to faster decision-making.

Cloud Migration and Analytics for Retail Company

  • Objective: Migrate on-premises data to the AWS cloud platform and develop analytics dashboards.
  • Tools: AWS Glue, SQL, Power BI, and AWS Lambda.
  • Key Achievements:
    Designed and executed a cloud migration strategy, successfully migrating 500GB+ of data to AWS, supporting the company’s global expansion.
    Created and optimized data pipelines for seamless migration, improving data processing speed by 25%.
    Developed Power BI dashboards to visualize sales KPIs, providing actionable insights that contributed to a 15% increase in sales.

Data Product Accelerator for US-Based Firm

  • Objective: Streamline ETL processes and develop data product accelerators.
  • Tools: Python, Pandas, PySpark.
  • Key Achievements:
    Developed data product accelerators that reduced ETL code-writing time by 50%, significantly improving productivity.
    Implemented ML-based data anomaly detection, reducing data validation errors by 20%.
    Automated data validation processes with Python, enhancing data integrity and reducing manual intervention.

Data Analyst

Clicksco
Hyderabad
12.2021 - 02.2022
  • Analyzed marketing and advertising data using Python, SQL, and BigQuery to generate comprehensive business reports.
  • Developed Python scripts for dynamic report generation, leveraging pandas for efficient data aggregation and summarization.
  • Created visualizations like line plots, bar charts, and trend charts to enhance data understanding and support decision-making.

Data Analyst Intern

Intugine Technologies
Bangalore
09.2020 - 06.2021
  • Enhanced decision-making for operational and leadership teams by analyzing data and presenting insights through interactive graphs, charts, and comprehensive reports.
  • Developed Python scripts using Pandas, NumPy, MongoDB, and Matplotlib, resulting in a 25% increase in data processing efficiency.
  • Created customer satisfaction reports from chat data, improving customer feedback analysis by 30%.
  • Developed a serverless REST API with Flask and Amazon API Gateway to enable dynamic report generation, reducing manual report generation time by 40%.
  • Automated daily, weekly, and monthly report generation in complex multi-sheet formats using cron-scheduled AWS Lambda functions, saving over 15 hours of manual work weekly.
  • Monitored and maintained data quality by identifying and removing corrupt data, leading to a 20% improvement in data accuracy.

Education

PGPDSE

Great Lakes Institute of Management
Chennai, TN
12.2021

Bachelor's in Computer Engineering

Lovely Professional University
Jalandhar, PB
06.2021

Skills

  • Data Analytics
  • Cloud Technologies
  • AWS Services
  • SQL
  • AWS Glue
  • PySpark
  • Databricks
  • Data Science
  • Python
  • AWS Lambda
  • Data Quality
  • IDQ
  • Power BI
  • Azure
  • Airflow
  • Alteryx

Certification

  • SQL (Intermediate) Certificate, HackerRank
  • AWS Cloud Practitioner (CLF-C02), Amazon
  • PL-300 Power BI Analyst
  • Alteryx Micro Credential

Accomplishments

  • Recognized as "Most Valuable Employee" for outstanding performance and contributions.
  • Consistently praised by clients and stakeholders for delivering high-quality results.

Timeline

Associate Consultant

KPMG Global Services
04.2022 - Current

Data Analyst

Clicksco
12.2021 - 02.2022

Data Analyst Intern

Intugine Technologies
09.2020 - 06.2021

PGPDSE

Great Lakes Institute of Management

Bachelor's in Computer Engineering

Lovely Professional University