Kaustubh Katkar

Data Engineer
Pune

Summary

Detail-oriented data engineer who designs, develops and maintains highly scalable, secure and reliable data structures. Accustomed to working closely with system architects, software architects and design analysts to understand business and industry requirements and develop comprehensive data models. Proficient in developing database architecture strategies at the modeling, design and implementation stages.

Overview

5 years of professional experience

Work History

Senior Analyst - HR

ADP
05.2021 - Current
  • Build an HR data platform that serves as the golden data source for analytics and data science
  • Set up pipelines between multiple SoRs (SuccessFactors, GV, EV5 & GJA) and the data platform, and create multiple layers of data stages in the Redshift DW
  • Set up automated Glue PySpark jobs that ingest data into the staging layer, integrate and load it into the curated layer, and apply business logic to generate datasets for analytics and reporting
  • Set up infrastructure (EC2 instance, Redshift cluster, S3 bucket, Glue jobs, IAM roles and policies, and API Gateway) using CloudFormation templates
  • Develop, deploy and provide operational support for a data science project
  • Implement data governance, data quality and data lineage processes

Programmer Analyst

Cognizant
08.2020 - 05.2021
  • Built data pipelines from on-prem data sources to a landing zone (AWS S3 bucket) and on to cloud databases such as Amazon Redshift and Snowflake.
  • Automated delta loads using AWS SNS, SQS, CloudWatch and Lambda with Python scripts.
  • Built a curated data layer over raw data stored in the database using SQL stored procedures and scheduled jobs.
  • Performed data archival, compression and ETL using Amazon Athena and AWS Glue.
  • Built tables and views using SQL stored procedures and developed MS Power BI reports for business reporting.

Associate IT Developer

Medtronic
02.2019 - 08.2020
  • Built data pipelines from on-prem data sources to a landing zone (AWS S3 bucket) and on to cloud databases such as Amazon Redshift and Snowflake.
  • Automated data loading from FTP services into SQL Server using Python scripting.
  • Modeled data by joining multiple tables to extract and present business-related KPIs in MS Power BI.
  • Built automated pipelines using Lambda, Glue & SQS for data ingestion into the Redshift data warehouse.

Education

PG Diploma in Big Data Analytics

Centre For Development of Advanced Computing
Pune
04.2001 -

Bachelor of Computer Engineering

University of Mumbai
Mumbai
04.2001 -

Skills

    Amazon Web Services
