Kritika Pareek

Bengaluru

Summary

Accomplished Data Engineer with a proven track record at Adobe, adept at designing scalable data solutions using Python, PySpark, and AWS technologies. Spearheaded initiatives that improved data accuracy and operational efficiency, demonstrating strong leadership and technical expertise. Experienced in delivering customer-centric solutions and driving significant enhancements in data processing and analytics.

Overview

7 years of professional experience
1 Certification

Work History

Data Engineer

Adobe
Bengaluru
10.2021 - Current
  • Designed and implemented a Kafka-based data pipeline to onboard data into the platform, ensuring high scalability and reliability.
  • Led and executed an AWS Kinesis initiative to overcome platform limitations, enabling seamless data transfer between two Adobe platforms.
  • Architected and delivered custom data solutions for various enterprise customers, addressing unique business requirements and driving operational efficiency.
  • Developed a custom integration to export data from the Adobe Platform to Google Campaign Manager 360, enhancing marketing attribution and campaign effectiveness.
  • Automated AWS Lambda-based data validation, quality checks, and profiling analysis, improving data accuracy and compliance for customers.
  • Presented and successfully implemented the above solutions for customers, ensuring seamless adoption and measurable impact.
  • Authored internal white papers detailing the architecture and implementation of the custom Adobe-Google Campaign Manager 360 integration and the Kafka-based data pipeline, providing technical guidance for future scalability.

Associate Consultant

Saama Technologies
Pune
03.2020 - 10.2021
  • Developed a generic XML parser in PySpark to process XML scripts and extract the required data efficiently.
  • Built PySpark-based web scrapers to collect XML data from various sources, transforming and storing the extracted data in Amazon Redshift tables.
  • Designed and implemented PySpark workflows to fetch data from Redshift, convert it to ORC format, and store it in Amazon S3, following a structured folder hierarchy.
  • Engineered and executed parallel ETL pipelines, integrating complex SQL queries, shell scripts, and PySpark scripts within XML-driven workflows.
  • Developed and maintained ETL workflows, handling data population, Change Data Capture (CDC), and seamless data movement between Amazon Redshift and other storage systems.
  • Gained extensive hands-on experience with Amazon Redshift, including SQL queries, stored procedures, and performance optimization for large-scale data processing.

Software Developer

ADMAXIM INDIA PVT LTD
Bengaluru
06.2018 - 03.2020
  • Designed, developed, and deployed a website/mobile app classification model, including data extraction and preprocessing using Python scripting.
  • Built and deployed a user demographics classification model (age, gender, income, etc.) using PySpark and Python, optimizing data extraction and preprocessing workflows.
  • Led the production deployment of a data aggregation pipeline using Apache Spark (PySpark) for large-scale data processing.
  • Successfully implemented Apache Druid (OLAP DB) in production, integrating it with Apache Kafka to handle real-time data ingestion, as per client requirements.
  • Developed a client-facing data visualization interface using Apache Superset, enabling insightful analytics and reporting.
  • Strong hands-on experience with Python (advanced level), Hive, and AWS services, optimizing data workflows for performance and scalability.
  • Optimized Druid query performance, improving execution times for both JSON and SQL-based queries to enhance analytical efficiency.

Education

Post Graduate Diploma - Big Data Analytics

C-DAC ACTS
Pune
02-2018

Bachelor of Engineering (B.E.) - Computer Science & Engineering

Pune University
V.A.C.O.E Ahmednagar
05-2017

Skills

  • Python, PySpark, SQL
  • AWS - Glue, Redshift, EMR, Lambda, MSK, Kinesis
  • Kafka
  • Docker, CI/CD, Airflow
  • Adobe - Adobe Experience Platform

Affiliations

  • Fitness enthusiast
  • Reading

Languages

Hindi - First Language
English - Proficient (C2)

Certification

  • AI & MLOps (Indian Institute of Science)
  • Azure Data Engineer Associate

References

References available upon request.
