
Data Engineer with 2 years of experience designing and supporting production-grade data processing platforms on AWS. Hands-on experience building CSV-to-Parquet ETL pipelines with PySpark on EC2 and AWS Glue, integrating data from SFTP, APIs, and on-prem systems. Experienced in cloud-native orchestration with Step Functions and Lambda, managing Iceberg tables and RDS, and ensuring data reliability through monitoring, alerting, and SLA ownership. Strong background in Terraform-based infrastructure automation and in building cost-efficient, scalable data workflows.
Python
Version control and CI/CD
Git, GitLab CI/CD, and AWS CI/CD (AWS CodeCommit, CodeBuild, CodeDeploy, and CodePipeline)
Data engineering and analytics
Apache Spark, PySpark, Pandas, big data processing, data conversion, ETL pipelines, data migration
Data Build Tool (dbt), Snowflake, Apache Iceberg
Gen AI
GitHub Copilot (Claude Sonnet, GPT)
AWS Cloud
S3, EC2, Glue (ETL, Crawlers, Data Catalog), RDS, Lambda, API Gateway, DynamoDB, Athena, CloudWatch, Route 53, IAM, SNS, SQS, AWS Transfer Family (SFTP), Step Functions, and more
Database
SQL Server, MySQL, and RDS (Aurora PostgreSQL, Aurora MySQL)
Infrastructure as code (IaC)
Terraform and OpenTofu
Apache Airflow