Rajat Choudhary

Summary

Results-driven Data Engineer with hands-on experience in designing and managing ETL pipelines using Python, Pandas, PySpark, Databricks, and AWS Glue. Proficient in SQL, in building back-end APIs with FastAPI and custom UIs with React.js, and in streamlining data-centric processes. Skilled in integrating data with back-end systems and API-driven platforms, using Redis for caching and PostgreSQL for relational data storage. Experienced with AWS services such as S3, EC2, Athena, EMR, Glue, and Lambda to build, manage, and optimize scalable cloud solutions. Knowledgeable in GenAI tools, LangGraph, and large language models (LLMs) for intelligent data processing and automation.

Overview

4 years of professional experience

Work History

Data Engineer

Agilisium Consulting
Chennai
07.2025 - Current

Project: mAI Access GVD Creation

  • Designing and maintaining ETL pipelines for clinical and operational data using Python, AWS Textract, and a vector database.
  • Implementing FastAPI for backend APIs and integrating LLM-based GenAI tools to automate data insights.
  • Leveraging LangGraph for intelligent workflow automation and seamless integration with reporting dashboards.
  • Ensuring data quality, performance optimization, and timely delivery of analytics to support healthcare decision-making.

Data Engineer

Jean Martin System
Chennai
07.2021 - 06.2025

Project: Event-Driven Data Pipeline

  • Implemented an event-driven data pipeline where incoming data in Amazon S3 triggers AWS Lambda functions for validation and preprocessing.
  • Processed validated data using AWS Glue with PySpark, applying business rules, transformations, and data quality checks.
  • Leveraged Amazon Athena for querying processed data and generating analytics for downstream systems.
  • Automated data workflows to ensure seamless ingestion, processing, and storage for analytics.
  • Improved scalability and efficiency by leveraging serverless architecture and cloud-native services.

Project: Viacom Data Processing (Batch Data)

  • Developed a scalable ETL system using PySpark to extract data from Amazon S3 and run transformations on a cluster powered by EMR.
  • Handled data customization processes (modifying, filtering, and transforming data) to align with client-specific requirements.
  • Optimized batch processing to handle large datasets efficiently, improving data processing speeds.

Project: XML to CSV Transformation & Data Pipeline

  • Built an end-to-end data processing pipeline that converts XML data to CSV using Python and Pandas.
  • Automated data ingestion into PostgreSQL and Amazon S3, ensuring efficient storage and retrieval for further analysis.
  • Collaborated across teams to provide clean and structured data pipelines for reporting and analytics.

Project: Web Scraping & Data Analysis

  • Employed BeautifulSoup, Selenium, Requests, and Pandas to extract detailed data from LinkedIn, including roles, company URLs, and technologies.
  • Created actionable insights from the scraped data to support strategic decision-making and analysis.
  • Automated scraping workflows with Selenium to handle dynamically rendered pages.

Education

MCA

ABES Engineering College

BCA

SD College Of Management And Studies

Skills

Programming Languages: Python

Frameworks: Django REST Framework, FastAPI

Libraries: Pandas, NumPy, Selenium, and BeautifulSoup

AWS Services: IAM, S3, EC2, Lambda, EMR, Glue, Athena

Databases/Servers: SQL and PostgreSQL

Big Data Ecosystem: Spark/PySpark

GenAI & LLM Tools: LangGraph, GPT/LLM integration

Version Control: Git, GitHub

Containerisation: Docker
