Prashant Kumar

Big Data Architect
Bangalore

Summary

Dynamic big data developer with over 12 years of experience in big data engineering, generative AI, machine learning, and cloud platforms, specializing in Azure Databricks and PySpark. Proven ability to design and implement end-to-end data architectures, Delta Lake pipelines, and robust data governance frameworks to ensure optimal performance and compliance.

Overview

14 years of professional experience

Work History

Solution Architect

Sigmoid
03.2023 - Current
  • Led a team of 12 data engineers and analysts in the design, development, and maintenance of data solutions on the Azure cloud platform.
  • Designed and implemented data pipelines using Azure Databricks, PySpark, and Power BI for large-scale data processing, analysis, and reporting.
  • Streamlined data extraction, transformation, and loading (ETL) processes, resulting in a 40% reduction in data processing time.
  • Collaborated with cross-functional teams to gather requirements, create data models, and ensure data quality and consistency.
  • Implemented data security and compliance measures to protect sensitive data and meet regulatory requirements.
  • Developed and maintained Azure Data Factory (ADF) pipelines to orchestrate and automate data workflows.
  • Provided technical leadership and mentorship to team members, fostering professional growth and skill development.
  • Environment: Spark, Databricks, Azure, Python, SQL, Salesforce, Google Analytics, GenAI

Senior Data Engineer

INTUIT
11.2021 - 03.2023
  • Implemented data pipelines using technologies such as Apache Spark, Hadoop, and Airflow to process large volumes of data (up to 20 PB) efficiently.
  • Created data models and designs for data marts.
  • Handled complex JSON structures.
  • Profiled and cleaned data received from various sources.
  • Managed high-volume warehouse data arriving from various sources, maintaining a proper data audit trail history.
  • Led offshore teams on technical matters and provided technical solutions to them.
  • Worked in an agile environment and fully adhered to the agile process.
  • Environment: S3, AWS EMR, Databricks, PySpark, Hive, Airflow

Sr. Software Engineer

Impetus
01.2019 - 11.2021
  • Benchmarked the Parquet and CarbonData file formats on 10 TB of data using the 99 TPC-DS queries.
  • Worked on the storage and query abstraction layer.
  • Generated 10 TB of data locally with the TPC-DS tool and transferred it to HDFS and S3 for performance testing.
  • Applied and tested various performance-tuning parameters on CarbonData and Spark.
  • Developed a Spark job to run the 99 TPC-DS queries on the cluster.
  • Analyzed and tuned query plans.
  • Generated data cubes for faster data access.
  • Environment: Spark Core, Spark SQL, Unix, Hadoop, Scala, CarbonData, TPC-DS Tool

Applications Development Specialist 2

IQVIA
10.2019 - 11.2021
  • Worked as a senior developer implementing a data lake (data warehouse) application using Spark and Scala.
  • Created complex, reusable, and robust data pipelines using Spark and Scala, and wrote complex Hive and Spark SQL queries to support them.
  • Managed high-volume warehouse data arriving from various sources, maintaining a proper data audit trail history.
  • Led the offshore team on technical matters and provided technical solutions to them.
  • Worked in an agile environment and fully adhered to the agile process.
  • Environment: Spark, Scala, Hive, Impala, Kafka, Airflow, Unix Scripting, Python, SQL
  • Architecture: Cloudera Hadoop distributed cluster computing.

Senior Developer

Infosys Ltd
02.2014 - 05.2018
  • Wrote jobs for data ingestion from Amazon S3 into HDFS
  • Performed various validations and ETL transformation on the data
  • Created subsets from trillions of records and built various models per client requirements.
  • Developed Hive Script with UDFs in Java
  • Optimized code for better performance and removal of errors
  • Performed requirement elicitation.
  • Environment: Spark Core, Spark SQL, Hive, Scala, Core Java, Amazon S3, HDFS, Oracle 10g, Unix Scripting

Intern

HPES
06.2012 - 08.2012
  • Worked on various proofs of concept (POCs) using Java.

Education

Bachelor of Engineering - Information Technology

IITM Gwalior (RGTU)

Skills

Azure Databricks

PREVIOUS EXPERIENCE

Senior Software Developer, Infosys Limited, 02/2014 - 06/2018

Intern, HP, 06/2012 - 08/2012

DECLARATION

I hereby declare that the particulars written above are true to the best of my knowledge and belief.

PRASHANT KUMAR (Bengaluru)
