ABHAS UPADHAYAY

Software Engineer
Dehradun, Uttarakhand

Summary

Results-driven Software Engineer with over 4 years of experience building large-scale distributed computing systems. Consistently recognized for performance excellence and for understanding complex business requirements while helping companies collect, collate, and exploit digital assets.

Overview

5 years of professional experience
6 years of post-secondary education
4 certifications

Work History

Senior Software Engineer

Spoonshot
Bangalore, Karnataka
08.2021 - Current
  • Maintained a Kafka cluster on Kubernetes. Resolved Kafka rebalancing and consumer timeout problems, which increased throughput for 7 pipelines across 3 teams.
  • Used PySpark, Unix, Kafka, Airflow, MySQL, and Redis to ingest and process millions of records per day in streaming and batch pipelines.
  • Maintained a SolrCloud cluster on Kubernetes. Ran experiments to scale SolrCloud for our use case and reduced p95 query time from 1s to 75ms. Rewrote a legacy pipeline to use Solr instead of Spark for search, which reduced turnaround time from 6 hours to 5 minutes.
  • Reviewed hardware infrastructure planning and set up a production Elasticsearch cluster based on our requirements and data growth.
  • Made extensive use of MongoDB, InfluxDB, and Prometheus to store and query application metrics in our microservices.
  • Created scalable data pipelines and API endpoints for the client-facing Concept Generator feature on our website using PySpark, FastAPI, Solr, and Elasticsearch, which helped us close a 91k USD deal with Pepsi.
  • Debugged Spark pipeline issues using Spark UI metrics and profilers (e.g. StreamingLens) and resolved them, reducing our pipeline's batch processing time from 2 hours to 12 minutes.
  • Regularly reviewed the latest open-source research and best practices in data engineering. Conducted a POC on Delta Lake to create an enterprise data warehouse.
  • Built a self-serve framework for data scientists to deploy ML models to Kubernetes hosted on Azure.

Data Engineer

Target Corporation India Ltd
Bengaluru, Karnataka
07.2018 - 07.2021
  • Designed software architecture for large-scale data pipelines and developed them using Kafka, Spark (Scala), Sqoop, PostgreSQL, Hive, scripts, and REST APIs to store and process structured, semi-structured, and unstructured data, following best practices throughout the software development life cycle.
  • Fixed the Hadoop small-files issue faced by data analysts, cutting job run time by 30 minutes and reducing memory usage.
  • Proactively managed pipelines in the production system with strong emphasis on SLAs.
  • Created data quality alerts, data lineage, and BI dashboards using Unix shell scripting, Spark, Grafana, and homegrown tools for application monitoring and data governance.
  • Created scalable pipelines using Spark (Scala) that enabled analytics on Target's financial assets data for the first time and helped serve 5 critical finance reports in the same quarter.
  • Automated in-store processes using data analysis and Spark (Scala), which saved developers 2 hours of effort and improved visibility into stock movement between stores.

Software Engineer Intern

Nvidia Graphics Pvt. Ltd
Pune, Maharashtra
08.2017 - 12.2017
  • Modified legacy code for the Nvidia GeForce Experience app and reduced installation time by 3 minutes.
  • Built a Python script for the Driver Validation System to validate the certificate of any package before installation.
  • Helped identify functionality issues in Nvidia's display driver dashboard.

Education

B.E. (Hons) - Electronics and Instrumentation

BITS Pilani University
Goa, CGPA: 7.46/10
07.2014 - 07.2018

Sr. Secondary School - Math/Science

Seven Oaks School
Dehradun, Percentage: 91.75
01.2012 - 01.2013

Secondary School

Seven Oaks School
Dehradun, Percentage: 94.6
01.2011 - 01.2012

Skills

Data Warehousing and ETL

Certification

Big Data with Hadoop and Spark, CloudxLab

PROJECTS

PROJECT-1: Concept Generator

  • Built a common framework using PySpark and Airflow that lets data scientists index their data into Apache Solr/Elasticsearch easily and independently.
  • Built client-facing APIs for our product using FastAPI to query and combine data from MySQL, Elasticsearch, and Solr, powering the Concept Generator feature, which was demoed at the Future Food-Tech event in London and was well received. Applied sharding and replication best practices to scale the APIs to handle traffic.

PROJECT-2: Finance Fixed Assets in Hadoop3

  • Built a framework using the Akka library in Spark (Scala) to ingest data from a REST API.
  • Built a data pipeline to store and process 500 million+ rows of structured, semi-structured, and unstructured data using Spark, Hive, Apache Oozie, GitHub, Unix shell scripting, Kafka, Grafana, and PostgreSQL to support 5 finance reports.

PROJECT-3: Inventory Sub Locations core history in Hadoop

  • Built a data pipeline to store and process 200 million+ rows of JSON data using Spark Structured Streaming, providing the latest state of each store's stock inventory every 6 hours.

PROJECT-4 (Self-project): Bank Customer Churn Prediction

  • Performed univariate and bivariate analysis on the input dataset and cleansed it using Pandas. Trained a Random Forest model on the cleansed data and predicted churn propensity with 90% accuracy.

Timeline

Basics of Python with Data Structures and Algorithms

08-2022

Senior Software Engineer

Spoonshot
08.2021 - Current

Big Data with Hadoop and Spark, CloudxLab

02-2021

Data Warehouse Fundamentals for Beginners, Udemy

10-2020

Machine Learning, Internshala

08-2020

Data Engineer

Target Corporation India Ltd
07.2018 - 07.2021

Software Engineer Intern

Nvidia Graphics Pvt. Ltd
08.2017 - 12.2017

B.E. (Hons) - Electronics and Instrumentation

BITS Pilani University
07.2014 - 07.2018

Sr. Secondary School - Math/Science

Seven Oaks School
01.2012 - 01.2013

Secondary School

Seven Oaks School
01.2011 - 01.2012