ABHISHEK SINGH

Lead Data Engineer
Stockholm

Summary

Experienced Lead Data Engineer adept at building and scaling data-intensive apps. Skilled in team leadership, innovation, and robust solution implementation using Python, Spark, Hadoop, GCP, AWS, Cassandra, Docker, and Data Visualizations. Recognized for establishing BI ecosystems, fostering data-driven cultures, and driving strategic initiatives for enhanced analytics and customer-focused platforms.

Overview

10 years of professional experience
2 years of post-secondary education
3 languages

Work History

Lead Data Engineer

Quinyx AB
Stockholm
09.2020 - Current

Building and structuring the organization's entire data pipeline to enable more data-driven decisions.

Lead Data Developer

Dunnhumby
Gurgaon
04.2019 - 10.2020

Highlights:

  • Core maintainer of an in-house big data framework
  • Responsible for building Docker containers to isolate local environments for developers
  • Configured CI/CD pipelines to utilize Docker containers
  • Delivered the first customer-focused analytical solution to one of the biggest fast food companies in the world
  • Part of the SME group involved in delivering trainings across the organization

Projects:

  • Mercury - Active contributor and maintainer of an in-house PySpark framework used by analysts and data scientists to deliver insights across the organization. Regularly worked with analysts, engineers, and stakeholders to enhance it by adopting new technologies, such as Docker-based containers to support multiple Python and PySpark versions. Built and extended features such as the data lake and the transformation and segmentation engine, built on Hive, Airflow, GCP/HDFS, Docker, and Git.
  • Local Environment - Segregated and isolated development services using Docker containers. Created Docker network bridges so that services such as Airflow, Grafana, and Spark could communicate with each other, giving developers a seamless, production-like experience.
  • ADS - Built the first customer-based analytical data solution for one of the biggest fast food companies in the world. Took part in on-site discussions and gathered requirements directly from the client. Evaluated tools such as AWS EMR and S3 to build the solution on top of existing Redshift- and Talend-based sources. Wrote the entire ETL in Talend and provided Python classes as an abstraction over the EMR Hive tables so end users could run ML and classification models.
  • Pipeline improvements - The earlier implementation used plain bash and specific versions of libraries to run CI/CD tasks. Isolated the CI/CD steps in Docker, removing dependencies on specific versions of Python or related libraries.

Senior Data Developer

Dunnhumby
Gurgaon
05.2017 - 03.2019

Highlights:

  • Built the first data monitoring setup using Grafana and InfluxDB
  • Contributed to writing the data ingestion tool used to store raw files
  • Developed, updated, and organized databases to handle customer data
  • Responsible for building data solutions for clients across NA and European markets and for migration from on-prem servers to the cloud

Projects:

  • Quality Dashboard - Used InfluxDB as a back end for visualizing data publish metrics in Grafana. Every data solution fed data to InfluxDB as time series, allowing Grafana to visualize the data and alert engineers to anomalies. Extended to provide various on-the-spot, customer-focused KPIs for the ingested data.
  • Ingest tool - Worked with the core team to create a self-service tool for storing raw files in Hadoop-based data stores. The tool let users supply a schema, read from zipped files, automatically partition data based on the timestamp attached to the file, and choose schema evolution options.
  • Collaboration - Used Git and other collaboration tools to work in a globally distributed team, handling merge requests as a master reviewer.
  • Data solution - Developed the ETL pipeline to ensure timely delivery of data solutions across markets for some of the world's top retailers. Created Python classes and modules using PySpark and SQL-based transformations to apply the business logic and required mappings.

Software Engineer

Inrhythm Solutions Pvt Ltd
Hyderabad
06.2013 - 04.2017
  • Proof-of-concept projects such as OSD using Hadoop, HDFS, MapReduce, Spark, Hive, HBase, and Cassandra
  • Involved in building the ETL process using various Teradata utilities and tuning underperforming scenarios
  • Explored mobile-friendly web applications built on PHP, jQuery, JS, and AJAX out of personal interest

Education

Master's in IT Project Management

Stockholm University
Stockholm, Sweden
08.2021 - Current

Bachelor of Technology - Computer Science

St. Peter's Engineering College
Hyderabad, India
08.2009 - 05.2013

Skills

Spark/Hadoop

Interests

Trekking
Biking

Work Availability

monday
tuesday
wednesday
thursday
friday
saturday
sunday
morning
afternoon
evening
