Govind Mishra

Senior Data Engineer
Ahmednagar

Summary

Senior Data Engineer with over 7 years of experience managing large-scale data pipelines that process 2-3 TB of data daily. My expertise spans AWS technologies such as Athena (Iceberg), MWAA, S3, Glue, EC2, and SQS, with a focus on reliable data transformation and data quality. I’ve developed and orchestrated custom Airflow DAGs and operators, built a DSAR process for compliance, and implemented CI/CD pipelines using GitHub workflows and Terraform. I collaborate effectively across time zones, leading projects while mentoring junior engineers. I combine technical proficiency, problem-solving, and leadership to deliver robust data solutions aligned with business goals and informed decision-making.

Overview

7 years of professional experience

Work History

Senior Data Engineer

Arkose Labs India Pvt. Ltd.
01.2024 - Current
  • Managed daily ingestion, transformation, and quality checks of 2-3 TB of data using Athena Data Lake (including Apache Iceberg), S3, AWS Glue, MWAA, EMR Serverless and EC2 to ensure efficient data processing and storage.
  • Developed and maintained Python scripts for data workflows, leveraging Apache Airflow for task orchestration. Designed and deployed multiple custom DAGs and Airflow operators for automating complex, scalable data pipelines.
  • Built a Data Subject Access Request (DSAR) process to ensure compliance with data privacy regulations, enabling efficient retrieval and processing of user data on demand.
  • Successfully implemented real-time and batch processing solutions using AWS SQS for asynchronous messaging and task coordination.
  • Led efforts to optimize data pipeline performance and ensure data quality through automated validations and robust quality checks.
  • Worked closely with cross-functional teams in various time zones, providing technical solutions and ensuring alignment with business objectives.
  • Mentored and provided technical guidance to junior engineers, supporting their task execution and fostering professional growth.
  • Integrated CI/CD pipelines with GitHub workflows for automated deployment and used Terraform for infrastructure as code, ensuring smooth, scalable, and consistent deployment processes.
  • Collaborated with data science, analytics, and product teams to align data initiatives with business needs and support data-driven decision-making.

Data Engineer

Yara Digital Farming India Pvt. Ltd.
10.2021 - 01.2024
  • Implemented data engineering best practices.
  • Improved, managed, and taught standards for code maintainability and performance through code submissions and reviews.
  • Maintained the data warehouse with timely, high-quality data.
  • Built and maintained data pipelines from internal databases and SaaS applications.
  • Created and maintained architecture and systems documentation.
  • Wrote maintainable, performant code.
  • Collaborated with other teams to drive efficiencies in their work and ensure their data needs were addressed.
  • Mentored junior team members to help them grow in their technical responsibilities.

Senior Data Engineer

Enquero Global LLP
03.2021 - 10.2021
  • Gathered business requirements and developed Informatica workflows to meet them.
  • Wrote complex SQL queries to extract data from source systems and load it into files following the CDL template, so that the files could be loaded into SAP Calidus.
  • Logged the status and metadata of each workflow run in a table to help the support team with daily production activity.
  • Discussed issues and requirements with the client to deliver the expected solution.

Technical Associate

SEARS HOLDINGS INDIA
04.2019 - 03.2021
  • Designed the mappings between sources (external files and databases) and operational staging targets.
  • Analyzed business requirements and translated them into technical solutions.
  • Prepared design documents and worked with the data modelers and project lead to understand the data model and design.
  • Worked with high-volume datasets from MS SQL Server, GCP BQ, Hive, and flat files.
  • Developed ETL procedures and SQL queries to ensure conformity, compliance with standards, and lack of redundancy, translating business rules and functional requirements into Pentaho ETL jobs.
  • Developed ETL jobs to validate source data within the pipeline itself, so that correct data was loaded into the database and incorrect records were sent back to the vendor.

Associate Engineer

SEARS HOLDINGS INDIA
06.2018 - 03.2019
  • Tuned SQL queries and views for correctness and performance.
  • Continuously analyzed existing solutions to identify possible improvements.
  • Performed testing and QA to ensure the accuracy of data.
  • Provided production support by resolving assigned JIRA tickets for data issues and supplying correct data to the business.
  • Worked with source databases of many flavors (MySQL, MS SQL Server, Hive, GCP BQ, Redshift, etc.) to access and expose necessary data.
  • Loaded data from various sources using transformations such as source data validators, database and table joins, aggregators, connected and unconnected lookups, filters, and expressions.

Intern

SEARS HOLDINGS INDIA
08.2017 - 05.2018
  • Learned ETL, SQL, and data warehouse concepts to build the foundation required for ETL development.
  • Assisted other ETL developers with their projects, developed less complex ETL processes, and wrote SQL queries.
  • Reviewed existing project documentation and updated it with new changes.

Education

Bachelor of Engineering - Electrical Engineering

SAVITRIBAI PHULE PUNE UNIVERSITY
Pune, India
04.2001 -

Skills

  • Strong working knowledge of data warehouse and data lake concepts, with an understanding of entities and relationships.
  • Excellent troubleshooting skills with demonstrable experience identifying bottlenecks in existing database and data pipeline processes.
  • Ability to write complex SQL and create database objects such as stored procedures.
  • Working experience with Python, Airflow, and AWS services such as S3, Athena, Glue, EMR Serverless, DMS, Redshift, and Lambda.
  • Experience working with ETL tools such as Pentaho and Informatica.
  • Experience extracting data from multiple APIs, SharePoint, and other data sources.
  • Strong written and verbal communication skills, along with analytical and problem-solving skills.
  • Understanding of Customer Data Platform tools such as Segment and marketing tools such as CleverTap and ActiveCampaign.
  • Experience in automated builds and deployments using CI/CD tools such as GitHub.
  • Basic knowledge of PySpark and Kafka.

Training and Certifications

  • Data Engineering Nanodegree, Udacity, 2021
  • Airflow Fundamentals, Astronomer, 2021
  • dbt Fundamentals, dbt Labs, 2023
  • AWS Certified Solutions Architect – Associate, Amazon Web Services (AWS), 2020
  • AWS Certified Cloud Practitioner, Amazon Web Services (AWS), 2020
