Results-driven Data Engineer with hands-on experience designing and managing ETL pipelines using Python, Pandas, PySpark, Databricks, and AWS Glue. Proficient in SQL, in building custom UIs with FastAPI and React.js, and in streamlining data-centric processes. Skilled in integrating data with back-end systems and API-driven platforms, using Redis for caching and PostgreSQL for relational data storage. Experienced in using AWS services such as S3, EC2, Athena, EMR, Glue, and Lambda to build, manage, and optimize scalable cloud solutions. Knowledgeable in GenAI tools, LangGraph, and large language models (LLMs) for intelligent data processing and automation.
Project: mAI Access GVD Creation
Project: Event-Driven Data Pipeline
Project: Viacom Data Processing (Batch Data)
Project: XML to CSV Transformation & Data Pipeline
Project: Web Scraping & Data Analysis
Programming language: Python
Frameworks: Django REST Framework, FastAPI
Libraries: Pandas, NumPy, Selenium, and BeautifulSoup
AWS Services: IAM, S3, EC2, Lambda, EMR, Glue, Athena
Databases/Servers: SQL and PostgreSQL
Big Data Ecosystem: Spark/PySpark
GenAI & LLM Tools: LangGraph, GPT/LLM integration
Version Control: Git, GitHub
Containerisation: Docker