Ashish More

Chhatrapati Sambhajinagar

Summary

Data Engineer with a proven track record at ZS, specializing in building scalable ETL pipelines and cloud-based data solutions. Proficient in Python and AWS, with particular strength in optimizing data workflows and improving data quality. A collaborative team player who drives results through practical problem-solving and advanced analytics.

Work History

Data Engineer

ZS
Chhatrapati Sambhajinagar
  • Developed and maintained cloud-based data processing solutions using Python, PySpark, and Databricks, ensuring efficient handling of large-scale data workloads.
  • Participated in all SDLC phases (design, development, testing, and deployment) of AWS-based applications.
  • Designed and implemented scalable data pipelines and Data Lake architectures using AWS S3, Glue, Lambda, Step Functions, and Spark/PySpark.
  • Integrated AWS SNS, SQS, and Redshift to build event-driven and real-time data workflows.
  • Created Amazon QuickSight dashboards enabling business stakeholders to derive actionable insights.
  • Managed high-volume data storage and retrieval processes using DynamoDB with optimized performance tuning.
  • Monitored and troubleshot applications using CloudWatch logs, metrics, and automated alerts.
  • Utilized Git for version control, branching, code reviews, and GitHub-based CI/CD automation.
  • Implemented secure IAM roles, policies, and cross-service access controls.
  • Developed robust ETL pipelines using AWS Glue (PySpark) to process and transform batch datasets from multiple sources (see the sketch after this list).
  • Designed a multi-layered Data Lake (Raw, Curated, Transformed) ensuring governance, reusability, and data quality.
  • Implemented Glue Crawlers and Catalog Tables for automated schema detection and metadata management.
  • Utilized Amazon Athena for ad-hoc SQL validation, profiling, and performance checks.
  • Integrated Snowflake as the enterprise data warehouse with optimized COPY/Snowpipe-based ingestion strategies.
  • Automated infrastructure provisioning using AWS CloudFormation for reproducible and consistent deployments.
  • Built CI/CD pipelines using GitHub Actions to enable automated code deployment across environments.
  • Collaborated with cross-functional teams to optimize workflows, enhance system reliability, and reduce processing times.
  • Integrated Azure Data Factory (ADF) as the scheduler/orchestrator to manage cross-cloud workflows across AWS DMS, Glue, Databricks, S3, and Snowflake.
  • Performed advanced data processing in Databricks, including Delta Lake optimization, SCD2 implementation, complex joins, and large-scale Spark transformations.
  • Developed AI-driven automation and data intelligence components using LangChain for LLM orchestration, prompt pipelines, and autonomous agents.
  • Utilized HuggingFace transformer models for NLP tasks such as sentiment analysis, classification, summarization, and entity extraction within ETL workflows.
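
A minimal sketch of the kind of Glue (PySpark) batch ETL job described above, promoting data from the Raw to the Curated layer of the Data Lake. The catalog database, table, bucket, and column names are placeholders for illustration, not the actual project resources.

    import sys
    from awsglue.utils import getResolvedOptions
    from awsglue.context import GlueContext
    from awsglue.job import Job
    from pyspark.context import SparkContext

    # Standard Glue job bootstrap
    args = getResolvedOptions(sys.argv, ["JOB_NAME"])
    glue_context = GlueContext(SparkContext())
    job = Job(glue_context)
    job.init(args["JOB_NAME"], args)

    # Read the Raw-layer table registered by a Glue Crawler (placeholder names)
    raw = glue_context.create_dynamic_frame.from_catalog(
        database="raw_db", table_name="orders"
    )

    # Basic cleansing before promoting records to the Curated layer
    curated = raw.toDF().dropDuplicates(["order_id"]).filter("order_id IS NOT NULL")

    # Land the Curated layer on S3 as partitioned Parquet for Athena and Snowflake ingestion
    (curated.write.mode("overwrite")
        .partitionBy("order_date")
        .parquet("s3://example-datalake/curated/orders/"))

    job.commit()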

Data Engineer

CLAIRVOYANT
  • Migrated data from various sources to Amazon Redshift, ensuring data integrity, completeness, and accuracy.
  • Utilized AWS Glue to design and implement ETL pipelines for transforming and loading data into Redshift.
  • Employed Terraform as Infrastructure-as-Code to provision and manage AWS resources efficiently and consistently.
  • Leveraged DBeaver for managing and documenting source metadata, improving data discovery and governance.
  • Implemented Apache Airflow to orchestrate end-to-end data pipelines, ensuring timely and reliable data processing (an illustrative DAG follows this list).
  • Designed and built a Delta Lake architecture on Amazon S3 to enhance storage optimization, ACID transactions, and query performance.
  • Monitored and optimized ETL processes, SQL queries, and workflows for performance, scalability, and cost efficiency.
  • Worked closely with cross-functional teams to align project goals, troubleshoot issues, and enhance data workflows.
  • Integrated Azure Data Factory (ADF) as an additional orchestration layer for hybrid cloud pipelines involving AWS and on-prem systems.
  • Used Azure Databricks for scalable data processing, Delta Lake transformations, and advanced analytics workloads.
  • Leveraged Azure Data Lake Storage (ADLS) for staging, archival, and cross-cloud data sharing with AWS-based pipelines.
  • Implemented Azure Monitor and Log Analytics for cross-cloud pipeline monitoring, alerting, and proactive health checks.
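
An illustrative Airflow 2.x DAG showing the orchestration pattern described above: a daily run that triggers a Glue ETL job via boto3 before downstream validation. The DAG id, Glue job name, and schedule are placeholders, not the project's actual configuration.

    from datetime import datetime

    import boto3
    from airflow import DAG
    from airflow.operators.python import PythonOperator


    def start_glue_job(**_):
        # Kick off the (hypothetical) Glue job that stages source data for Redshift
        boto3.client("glue").start_job_run(JobName="redshift_staging_load")


    def run_quality_checks(**_):
        # Placeholder for row-count / reconciliation checks before downstream loads
        print("validating staged data")


    with DAG(
        dag_id="redshift_etl",
        start_date=datetime(2023, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        stage = PythonOperator(task_id="start_glue_job", python_callable=start_glue_job)
        validate = PythonOperator(task_id="run_quality_checks", python_callable=run_quality_checks)
        stage >> validate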

Data Engineer

TAVANT
  • Migrated data from source SQL Server systems to Snowflake using optimized ETL workflows (illustrated in the sketch after this list).
  • Migrated data from various file formats and RDS to Snowflake with complete validation and reconciliation.
  • Performed complex data transformations as per ETL specifications and business logic.
  • Wrote PySpark scripts for large-scale data extraction and processing.
  • Applied deep knowledge of Spark architecture, including Spark Core, Spark SQL, and optimized transformation strategies.
  • Built high-performance data pipelines using AWS S3, Glue, PySpark, Athena, and Snowflake.
  • Developed batch-processing solutions to process structured and unstructured data efficiently.
  • Performed hands-on data ingestion, transformation, performance tuning, and query optimization.
  • Applied data-cleaning techniques to remove inaccurate, duplicate, or corrupted records, significantly improving data quality.
  • Worked extensively with Parquet file formats and compression techniques to optimize storage and performance.
  • Filtered, aggregated, and enriched data to deliver meaningful insights to stakeholders for analytics and decision-making.
  • Built a strong understanding of upstream source systems and business requirements to ensure accurate data mapping.
  • Used Apache Airflow for end-to-end pipeline orchestration and monitoring.
  • Worked with Spark APIs, including PySpark and Spark SQL, for distributed processing.
  • Used Databricks for scalable Spark processing, Delta Lake implementation, SCD handling, notebook-based ETL development, and job orchestration.
  • Environment: PySpark, Databricks, AWS (S3, Glue, Athena, RDS), SparkSQL, DMS, Snowflake.
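
A hedged sketch of the SQL Server-to-Snowflake path mentioned in the first bullet, using Spark's JDBC reader and the Spark-Snowflake connector (which must be available on the cluster). Every URL, credential, and table name below is a placeholder.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("sqlserver_to_snowflake").getOrCreate()

    # Extract from SQL Server over JDBC (placeholder connection details)
    source = (spark.read.format("jdbc")
              .option("url", "jdbc:sqlserver://src-host:1433;databaseName=sales")
              .option("dbtable", "dbo.customers")
              .option("user", "etl_user")
              .option("password", "***")
              .load())

    # Apply ETL-spec transformations (illustrative only)
    transformed = source.dropDuplicates(["customer_id"])

    # Snowflake connection options for the Spark-Snowflake connector (placeholders)
    sf_options = {
        "sfURL": "myaccount.snowflakecomputing.com",
        "sfUser": "etl_user",
        "sfPassword": "***",
        "sfDatabase": "ANALYTICS",
        "sfSchema": "STAGING",
        "sfWarehouse": "LOAD_WH",
    }

    # Load the transformed data into Snowflake
    (transformed.write
        .format("net.snowflake.spark.snowflake")
        .options(**sf_options)
        .option("dbtable", "CUSTOMERS")
        .mode("overwrite")
        .save())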

Data Engineer

VERISK ANALYTICS
  • Developed PySpark programs, created DataFrames, and implemented transformations.
  • Analyzed upstream source systems and aligned data processing with business use cases.
  • Engaged with business analysts to clarify requirements and surface missing scenarios.
  • Worked on Spark SQL code as an alternative approach for faster data processing and better performance.
  • Performed data ingestion, transformation, and performance tuning.
  • Transferred data from RDBMS sources to HDFS and Hive tables using PySpark (see the sketch after this list).
  • Ingested and processed large volumes of data from structured and semi-structured sources into HDFS (AWS Cloud).
  • Supported QA activities, including test data creation and unit testing.
  • Actively participated in daily stand-ups, biweekly Scrum ceremonies, and project meetings.
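
An illustrative PySpark snippet for the RDBMS-to-HDFS/Hive transfer pattern noted above; the JDBC URL, credentials, paths, and table names are hypothetical.

    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .appName("rdbms_to_hive")
             .enableHiveSupport()
             .getOrCreate())

    # Pull a source table over JDBC (placeholder connection details)
    orders = (spark.read.format("jdbc")
              .option("url", "jdbc:mysql://source-host:3306/sales")
              .option("dbtable", "orders")
              .option("user", "etl_user")
              .option("password", "***")
              .load())

    # Light cleanup before landing the data
    clean = orders.dropDuplicates(["order_id"]).where("order_id IS NOT NULL")

    # Write to HDFS as Parquet and register a Hive table on top of it
    clean.write.mode("overwrite").parquet("hdfs:///data/raw/orders/")
    clean.write.mode("overwrite").saveAsTable("raw_db.orders")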

Associate Support Data Engineer

CFA
  • Designed, created, and managed scalable ETL (extract, transform, load) systems and pipelines for various data sources.
  • Managed, improved, and maintained existing pipelines.
  • Optimized existing data quality and data governance processes to improve performance and stability.

Education

Some College (No Degree) - Major

MGM JAWAHARLAL NEHRU COLLEGE OF ENGINEERING
AURANGABAD

Skills

  • Python
  • SQL
  • Java
  • Scala
  • PySpark
  • AWS
  • Azure Data Factory
  • HDFS
  • Spark
  • Airflow
  • Kafka
  • Glue
  • Athena
  • Snowflake
  • Databricks
  • Lakehouse
  • Spark Core
  • DataFrames
  • Spark SQL
  • Jira
  • Git
  • GitHub
  • Docker
  • GitHub Actions
  • LangChain
  • Hugging Face
  • Generative AI
  • Delta Lake

Certification

Problem Solving certifications from HackerRank

Accomplishments

Winner and runner-up in multiple inter-college coding competitions.

Projects

NETLINK (ZS)
  • Developed and maintained cloud-based data processing solutions using Python and PySpark.
  • Involved in multiple stages of the software development lifecycle (SDLC).
  • Designed and implemented data pipelines and Data Lake architectures leveraging AWS services.
  • Created QuickSight dashboards for data visualization and analytics.
  • Managed data storage and retrieval processes using DynamoDB.

EXL (CLAIRVOYANT)
  • Migrated data from various sources to Amazon Redshift.
  • Utilized AWS Glue to design and implement ETL processes.
  • Employed Terraform for infrastructure as code.
  • Implemented Apache Airflow to orchestrate data pipelines.

TAVANT
  • Migrated data from SQL Server to Snowflake.
  • Wrote PySpark scripts for data extraction.
  • Developed batch processing and integrated solutions.

RENOUS INTRANET (VERISK ANALYTICS)
  • Developed PySpark programs and created DataFrames.
  • Engaged with business analysts to understand requirements.
  • Worked on Spark SQL code for faster data processing.

Life Circle (CFA: Associate Support Data Engineer)
  • Designed, created, and managed scalable ETL systems and pipelines.
  • Optimized and improved existing data quality and governance processes.
