
Ankit Jaiswal

Data Engineer
Ghaziabad

Summary

Results-driven Data Engineer with over 10 years of experience designing and implementing large-scale data engineering solutions on Google Cloud Platform (GCP) using modern DevOps practices. Expertise in BigQuery, Cloud Composer, Cloud Storage, Dataflow, Dataproc, and Airflow, combined with advanced proficiency in Python, SQL, PL/SQL, and ETL pipelines. Skilled in data warehousing, data migration, and integrating DevOps tools such as Git, Jenkins, Docker, Kubernetes, and REST APIs to automate and streamline data workflows.

Overview

11 years of professional experience

Work History

Data Engineering Manager

Ernst & Young
11.2022 - Current

Internal Audit Analytics | Generative AI Analytics

  • Led a team of 6 to deliver scalable data engineering and migration solutions for financial institutions, utilizing GCP services like BigQuery, Dataflow, and Cloud Storage for efficient data processing and reporting.
  • Designed and automated ETL pipelines with Python, PySpark, and GCP Dataproc, leveraging Apache Airflow on GCP Cloud Composer to improve workflow efficiency and reduce manual effort.
  • Enhanced data accessibility by creating centralized repositories on GCP Cloud Storage and implementing robust data models in BigQuery for comprehensive lifecycle tracking and lineage.
  • Improved testing efficiency by 80% through automated data profiling with Python and BigQuery ML, and streamlined deployments with CI/CD pipelines using Jenkins, Kubernetes, and GCP Cloud Build.
  • Integrated GCP-native tools like Data Catalog for data governance and developed secure handling policies for PII with GCP IAM, ensuring compliance and data quality.
  • Managed project documentation in Confluence and tracked issues, project progress, and team collaboration in JIRA, maintaining transparency across the team.
  • Spearheaded the development of a Generative AI-powered real-time chatbot using a Retrieval-Augmented Generation (RAG) pipeline, enhancing compliance, sales, and internal audit workflows, with a 30-40% improvement in operational efficiency.
  • Created specialized RAG pipeline chatbots for regulatory compliance (real-time document insights), sales optimization (improved conversations and conversion rates), and internal audit (streamlined processes).
  • Developed SMART (Social Media Analysis Reporting Tool), leveraging sentiment analysis with Hugging Face models to analyze platforms such as Facebook, Twitter, and Instagram and deliver actionable insights.
  • Enabled process improvements and achieved a 65% efficiency increase for operations teams by identifying key public sentiment trends and implementing data-driven recommendations.

Senior Application Developer

Oracle
04.2021 - 11.2022
  • Requirements and System Analysis: Collaborated with clients to gather requirements, conduct system analysis, and finalize technical and functional specifications for seamless migration of on-premise data to GCP BigQuery and Cloud Storage using Python.
  • Automation and Orchestration: Automated data migration processes with Python scripts and orchestrated workflows using Apache Airflow, significantly reducing manual effort and improving operational efficiency.
  • Database Optimization: Executed and optimized database operations using SQL and PL/SQL, enhancing data workflows, query performance, and storage systems within the GCP ecosystem.
  • CI/CD Implementation: Designed and implemented CI/CD pipelines with Jenkins, Docker, and Kubernetes, ensuring automated deployment, continuous integration, and smooth application delivery in GCP environments.
  • Post-Migration Support & Collaboration: Oversaw testing, validation, and deployment of applications on GCP, provided post-deployment maintenance to ensure high system availability, and collaborated with cross-functional teams to meet project deadlines and client expectations.
  • REST API Integration: Designed and deployed REST APIs using Python, enabling seamless communication between migrated systems and other cloud services.

Senior Associate

Ameriprise Financial
05.2019 - 04.2021
  • Developed and managed ETL workflows for trade and portfolio data processing using Python libraries like Pandas and NumPy, combined with GCP Dataflow for scalable and efficient data transformation.
  • Automated manual tasks using Python, Shell scripts, and GCP Cloud Composer (Airflow), enhancing workflow efficiency and scalability, while scheduling pipelines with Tivoli Workload Scheduler (TWS).
  • Performed complex database operations and query optimizations using SQL Server and PL/SQL, ensuring seamless data transformations and improved database performance.
  • Migrated and modernized data pipelines to GCP, leveraging BigQuery for efficient processing, storage, and integration of trade-related data from platforms like Aladdin, enabling advanced analytics and decision-making.
  • Provided end-to-end production support by resolving pipeline issues, optimizing workflows, and leveraging GCP operations tools and ServiceNow for incident management to ensure minimal downtime and reliable operations.

Associate Consultant, Technology

SAAMA TECHNOLOGIES (I) LTD
03.2018 - 05.2019
  • Successfully developed and implemented two data warehouses in the Life Sciences domain within 12 months, leveraging GCP BigQuery and Cloud Storage for efficient data storage and analysis of critical business data.
  • Designed and optimized ETL pipelines using Python libraries like Pandas and NumPy to integrate data from multiple sources into Oracle and MS SQL databases, enabling seamless data transformation and storage.
  • Utilized SQL, PL/SQL, and Unix scripting to streamline data workflows, automate repetitive tasks, and enhance the robustness of data transformation processes.
  • Collaborated with stakeholders to design scalable and efficient data models, improving organizational reporting capabilities and enhancing accessibility to critical data.
  • Earned the "Shining Star" award twice for exceptional performance and significant contributions to data engineering initiatives, demonstrating technical expertise and impactful results.

Associate Consultant, Technology

Capgemini
06.2014 - 02.2018
  • Specialized in data integration and transformation using Informatica PowerCenter, implementing complex workflows and data pipelines to support critical business operations.
  • Designed and implemented change data capture (CDC) processes for the Barclays client, leveraging Informatica PowerCenter to track and efficiently manage data changes in real time.
  • Utilized Java Transformations and SQL Loader within Informatica PowerCenter 10.1 to process client data, including handling CLOB columns, ensuring seamless ingestion and storage.
  • Collaborated with Morgan Stanley to enhance data security by masking sensitive client data using Informatica versions 9.6 and 10.1, ensuring compliance with data governance standards.
  • Gained expertise in optimizing ETL workflows with Python and SQL, maintaining data accuracy, and ensuring the reliability of high-quality data pipelines for critical business functions.

Education

Master of Computer Applications

National Institute of Technology
Durgapur, India
04.2001 -

Skills

Python
