SAMEER SURYAWANSHI

Chennai

Summary

Data Engineer with 2.5+ years of experience designing and building scalable data pipelines and Lakehouse architectures on cloud platforms. Hands-on experience with Spark, SQL and Delta Lake for processing transactional, forecasting and analytics data at scale. Strong focus on data reliability, performance optimization and curated datasets that enable business reporting.

Overview

2 years of professional experience

Work History

Data Engineer

Dexian
Chennai
07.2023 - 12.2025
  • Built a cloud-based analytics platform for the Bihar Government's Department of Agriculture that consolidates data from multiple source systems and processes daily transactional and forecast data, delivering analytics-ready datasets to decision-making dashboards for crop coverage, subsidies, targets and rainfall forecasting.
  • Designed and maintained batch data ingestion pipelines using Azure Data Factory and Azure Databricks, ingesting data from relational databases, flat files and REST APIs with support for scheduling, retries and monitoring.
  • Implemented a Medallion-based Lakehouse architecture (Bronze, Silver, Gold) using Delta Lake to separate raw, refined and analytics-ready datasets, enabling reliable reprocessing and data quality management.
  • Performed large-scale data transformations using PySpark and Spark SQL, including joins, aggregations, standardization and enrichment of transactional and forecast datasets.
  • Delivered curated Gold-layer datasets optimized for Power BI dashboards, enabling reporting and KPI tracking for government stakeholders.
  • Worked closely with analytics and reporting teams to ensure data accuracy, freshness, and usability for Power BI dashboards.
  • Optimized Spark workloads using partitioning strategies, broadcast joins for reference data and data layout optimizations, improving reporting query performance.
  • Managed Delta Lake maintenance operations including VACUUM and table optimization to control storage growth and ensure consistent performance.
  • Supported pipeline monitoring, failure handling, and performance tuning to ensure timely and reliable availability of analytics data.
  • Designed scalable data pipelines, boosting efficiency by 30%.
  • Implemented ETL processes, reducing data processing time by 40%.
  • Optimized database performance, enhancing query speed by 50%.
  • Developed data models that improved reporting accuracy by 25%.
  • Automated data workflows, saving 20 hours of manual work weekly.
  • Led data quality initiatives, increasing data reliability by 35%.
  • Utilized cloud technologies, cutting infrastructure costs by 15%.

Education

M.Tech. - Industrial Automation

National Institute of Technology
Trichy
01.2023

B.E. - Mechanical Engineering

Yeshwantrao Chavan College of Engineering
Nagpur
01.2019

Skills

  • Python
  • SQL
  • PySpark
  • Apache Spark
  • Delta Lake
  • Lakehouse Architecture
  • Azure Databricks
  • Azure Data Factory
  • ADLS
  • Azure SQL DB
  • ETL processes
  • CI/CD
