Abhishek Ojha

Bengaluru

Summary

Reliable Data Engineer with expertise in data collection, organization, and cloud services, particularly skilled in Python scripting, AWS Services, Terraform, PySpark, Hudi, and SQL. Experienced in transforming data into functional formats to improve efficiency and ROI. Proficient in automating business processes and cloud integrations. Eager to learn and explore new technologies while contributing to agile projects for continuous skill enhancement.

Overview

6 years of professional experience
1 Certification

Work History

Data Engineer

Quicken
11.2021 - Current
  • Developed a centralized automated orchestration system for monitoring and processing sensitive data across all databases in the data lake. Used AWS Step Functions (state machine) with advanced 'map of map' properties to run PySpark scripts on EMR for each table independently, regardless of success or failure. Implemented Lambda to prepare payloads for orchestrating EMR jobs across different applications and databases. Managed Hudi tables according to Hudi standards and sent cross-account notifications on job success or failure to ensure completion updates for other teams.
  • Developed a centralized metadata repository system using DynamoDB, aggregating metadata from external sources, cross-AWS account tables, and other data sources. This system is leveraged by all data lake components for metadata validation and table processing, ensuring consistency and efficient data management.
  • Developed a data acquisition platform to ingest data from multiple sources into a data lake using AWS S3 for storage. Automated data acquisition from APIs, cross-AWS account transfers, and cloud sources like Google Analytics and Salesforce using Python. Leveraged AWS Lambda and Glue for serverless script execution, with PySpark used for data transformations.
  • Led end-to-end implementation of multiple high-impact projects from requirements gathering through deployment and post-launch support stages.
  • Optimized data processing by implementing efficient ETL pipelines and streamlining database design.
  • Collaborated with cross-functional teams for seamless integration of data sources into the company's data ecosystem.
  • Migrated legacy systems to modern big-data technologies, improving performance and scalability while minimizing business disruption.
  • Increased efficiency of data-driven decision making by creating user-friendly dashboards that enable quick access to key metrics.
  • Evaluated various tools, technologies, and best practices for potential adoption in the company's data engineering processes.
  • Streamlined complex workflows by breaking them down into manageable components for easier implementation and maintenance.
  • Provided technical guidance and mentorship to junior team members, fostering a collaborative learning environment within the organization.
  • Automated routine tasks using Python scripts, increasing team productivity and reducing manual errors.
  • Documented and communicated database schemas using accepted notations.

Data Engineer

Cognizant Technology Solutions
06.2018 - 10.2021
  • Worked with Teradata, PostgreSQL, AWS (Lambda, S3, EC2, CloudWatch), Databricks, and REST APIs for clients such as PepsiCo and Sanofi. Responsible for ETL processes, database design, automation, and data transformation using Teradata SQL, Oracle SQL, Python, and PySpark. Developed Python scripts for data processing, automated Databricks setups, and integrated REST APIs for seamless UI-database connectivity, ensuring data integrity and cloud security compliance.

Education

Master of Technology - Data Science

BITS
Pilani, India
08.2024

Bachelor of Technology - Electronics and Communication Engineering

Jalpaiguri Government Engineering College
Jalpaiguri
05.2018

Higher Secondary - Science

Shree Jain Vidyalaya
Kolkata
05.2014

Secondary

Shree Jain Vidyalaya
Howrah
03.2012

Skills

  • Python
  • AWS
  • Spark
  • Hudi
  • Terraform
  • SQL
  • Hive
  • REST API
  • Git
  • Data Analytics

Certification


  • Microsoft: Introduction to Python
  • Stanford University and DeepLearning.AI: Supervised Machine Learning: Regression and Classification
  • Stanford University and DeepLearning.AI: Advanced Learning Algorithms
  • AWS Solutions Architect Associate: in progress, to be completed within 1 month

Additional Information

  • Participated in internal hackathon projects at Quicken and was recognized by peers; the projects were later integrated as full-time mainstream projects.
  • Received recognition from the client and the Cognizant leadership team for critical deliverables, consistency, and innovative ideas and their implementation. Competed in coding and machine learning challenges organized by the company. Published whitepapers, internal to Cognizant, on innovative work I delivered.

Languages

English
Bilingual or Proficient (C2)
Hindi
Bilingual or Proficient (C2)
Bengali
Bilingual or Proficient (C2)
