

Results-driven Data Engineer and ETL Developer with 5+ years of experience designing, developing, and optimizing large-scale data pipelines and ETL workflows. Expertise in migrating legacy systems to modern cloud platforms (AWS, Azure) and converting legacy code to PySpark on Hadoop and Databricks. Strong background in data integration, data warehousing, and business intelligence, with a proven ability to improve data processing performance and ensure data quality across enterprise applications.
- Migrated Informatica workflows to Azure Databricks Workflows by converting the logic to PySpark, processing 10+ million financial records and improving data load performance by 40%.
- Performed data and file validation against the source Informatica tables and files, achieving 99% accuracy.
- Migrated legacy Sybase database code to PySpark on the Azure Databricks platform, processing 10+ million records and validating the output with 98% accuracy (see the validation sketch below).
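Source-to-target checks of the kind described above typically reduce to count and key reconciliation in PySpark. A minimal sketch, assuming hypothetical paths, table names, and a `txn_id` business key:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("migration_validation").getOrCreate()

# Source extract (exported from the legacy system) and migrated target on Databricks.
source_df = spark.read.parquet("/mnt/raw/source_extract/")   # hypothetical path
target_df = spark.table("finance.migrated_transactions")     # hypothetical table

# 1. Row-count reconciliation.
src_n, tgt_n = source_df.count(), target_df.count()
print(f"rows: source={src_n} target={tgt_n} match={src_n == tgt_n}")

# 2. Key reconciliation: rows present on one side but not the other.
keys = ["txn_id"]                                            # hypothetical business key
missing = source_df.select(keys).subtract(target_df.select(keys)).count()
extra = target_df.select(keys).subtract(source_df.select(keys)).count()
print(f"keys missing in target: {missing}; unexpected in target: {extra}")
```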
- Converted Vertica and DataStage code to PySpark in a Hadoop environment on a migration project, reducing query execution time by 40%.
- Engineered and executed a Spark 2 to Spark 3 upgrade with comprehensive testing, ensuring zero data loss and 100% compatibility across all modules.
- Designed and implemented a data validation framework ensuring data integrity across DataStage and PySpark workflows with a 99.9% accuracy rate.
- Managed Autosys job scheduling for the converted PySpark pipelines, scheduling and monitoring 50+ daily ETL jobs (see the scheduler-friendly entry-point sketch below).
- Collaborated in an Agile environment using JIRA, delivering weekly sprints with zero critical bugs in production.
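Autosys marks a job SUCCESS or FAILURE from the process exit status, so each converted pipeline benefits from an entry point that exits nonzero on any error. A minimal sketch, with hypothetical application and path names:

```python
import sys
from pyspark.sql import SparkSession

def run() -> None:
    spark = SparkSession.builder.appName("daily_load").getOrCreate()
    df = spark.read.parquet("/data/incoming/daily/")            # hypothetical input
    df.write.mode("overwrite").parquet("/data/curated/daily/")  # hypothetical output

if __name__ == "__main__":
    try:
        run()
        sys.exit(0)  # Autosys reads exit code 0 as SUCCESS
    except Exception as exc:  # surface any failure to the scheduler
        print(f"job failed: {exc}", file=sys.stderr)
        sys.exit(1)  # nonzero exit code is reported as FAILURE
```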
- Orchestrated migration of DataStage workflows to PySpark, creating 50+ AWS Glue jobs to replicate the existing data pipelines.
- Performed comprehensive data validation comparing DataStage and PySpark outputs, identifying and resolving 25+ data inconsistencies.
- Reduced data processing time by 50% through PySpark optimization and parallel-processing techniques (illustrative patterns below).
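Typical levers behind this kind of speedup are broadcast joins, key-based repartitioning, and selective caching. An illustrative sketch, with hypothetical dataset names and partition counts:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("optimized_etl").getOrCreate()

facts = spark.read.parquet("/data/facts/")  # large fact table (hypothetical)
dims = spark.read.parquet("/data/dims/")    # small dimension table (hypothetical)

# Broadcast the small side to avoid a shuffle-heavy sort-merge join.
joined = facts.join(F.broadcast(dims), on="dim_id", how="left")

# Repartition on the write key so downstream stages parallelize evenly.
joined = joined.repartition(200, "load_date")

# Cache only when the result feeds multiple downstream actions.
joined.cache()
joined.write.partitionBy("load_date").mode("overwrite").parquet("/data/out/")
```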
- Migrated Teradata (BTEQ) and Ab Initio workflows to PySpark on the Azure Databricks platform.
- Designed ETL pipelines handling 10+ million records daily with a 99.95% success rate.
- Implemented data quality checks and error-handling mechanisms, reducing post-processing issues by 35%.
- Converted Teradata (BTEQ) queries to optimized PySpark code, improving query performance by 45%.
- Developed and deployed 15+ AWS Glue jobs for automated data ingestion and transformation (see the Glue job skeleton below).
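A Glue PySpark job of this kind follows a standard read-transform-write shape. A minimal skeleton using the stock `awsglue` boilerplate; the database, table, key, and S3 path are hypothetical:

```python
import sys
from awsglue.context import GlueContext
from awsglue.dynamicframe import DynamicFrame
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read from the Glue Data Catalog, transform with plain Spark, write Parquet to S3.
dyf = glue_context.create_dynamic_frame.from_catalog(
    database="staging_db", table_name="raw_orders"  # hypothetical catalog entries
)
df = dyf.toDF().dropDuplicates(["order_id"])        # example transformation step

out = DynamicFrame.fromDF(df, glue_context, "out")
glue_context.write_dynamic_frame.from_options(
    frame=out,
    connection_type="s3",
    connection_options={"path": "s3://example-bucket/curated/orders/"},  # hypothetical
    format="parquet",
)
job.commit()
```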
Data Analytics Projects | Python, SQL, Tableau, Excel
- Expense Tracker Application (Excel with Macros)
- Stock Market Performance Analysis and Forecasting (Python)
- Data Science Job Salaries Analysis and Visualization (Python)
- Sales Data Analysis and Reporting (SQL)
- British Airways Dashboard with Business Intelligence Insights (Tableau)
- Video Games Market Dashboard and Trend Analysis (Tableau)
Azure Databricks | ETL Development | Cloud Migration | PySpark Optimization | Data Pipeline Architecture | AWS Glue | Data Warehousing | SQL Query Optimization | Data Validation | Apache Spark | Agile Methodology | JIRA | REST API Development