Pradeep B Y

Lead/Senior Data Engineer
Bangalore, Karnataka

Overview

12 years of professional experience
2 years of post-secondary education

Work History

Senior/Lead Data Engineer

Cisco
Bangalore
09.2022 - Current

Developed an end-to-end data migration pipeline with minimal downtime using Airflow, DMS, and Apache Spark.

  • Managed the migration of 20 million user records from MariaDB to AWS RDS with zero downtime.
  • Worked with open-source Temporal to build workflows for asynchronous tasks.
  • Optimized the Spark jobs in the BI pipeline.
  • Aligned business objectives with technical requirements through close collaboration with product owners, providing insights based on available data sources.

Lead/Senior Data Engineer - Data Platforms

Dataweave
Bangalore
03.2019 - 09.2022
  • Dataweave empowers strategic decision-making using data available in the public domain.
  • Built highly available, large-scale distributed systems, data pipelines, and serverless systems on multi-cloud infrastructure (AWS, Azure).
  • Deployed and managed a Kafka cluster to handle high-volume data, and SolrCloud to identify similar documents across sources.
  • Built efficient ETL pipelines using PySpark, Athena, and S3, and integrated them with Apache Airflow.
  • Worked with file formats such as Parquet and ORC.
  • Built a Hive data warehouse using AWS S3 and Qubole/EMR for historical data analysis.
  • Contributed to data modelling and query-level optimisations for better performance.
  • Built a large-scale web aggregation and distributed processing engine capable of aggregating and processing more than a billion web data points a day.
  • Used DynamoDB to query historical data as time series; designed the schema and policies such as retention period and autoscaling.
  • Built an efficient SQL ingestion system using incremental Sqoop jobs to ingest MySQL data into the Hive data warehouse for analytics.

Data Engineer - Platforms & Delivery

Dataweave
07.2017 - 02.2019
  • Contributed to the generic web crawling framework that handles 4000+ websites across more than 10 verticals.
  • Helped design and build the config manager in Django, which stores the dynamic crawl and extraction information for each website.
  • Exposed an API that serves the data to internal systems.
  • Helped design and build an internal Django dashboard used to monitor crawl jobs and act on the crawled and extracted data from the distributed crawler.
  • Generated insights on the web aggregation engine metadata using Elasticsearch, Redash, and Google Data Studio.
  • Developed a proxy management system for efficient large-scale crawling of the public web.
  • Set up CI/CD deployments using Python Fabric, Bitbucket Pipelines, and AWS CodeDeploy.
  • Designed and developed the MySQL database used by the internal dashboards and API.
  • Used Redis as a cache for better performance.

System Engineer

Tata Consultancy Services
08.2012 - 06.2015
  • Part of the Jaguar Land Rover Infotainment Development Team.
  • Worked on the HMI team building next-generation in-vehicle infotainment.
  • Developed the GUI and business logic for the DAB and Radio tuner features in C++ using Rhapsody and GL Studio.
  • Performed manual testing of the applications developed, including unit testing, code testing, and module testing.

Education

M.Tech - Computer Science

IIIT-B
Bangalore
07.2015 - 05.2017

Skills

Kafka

Hive/Athena

PySpark

Apache Airflow

MySQL/Postgres

DynamoDB

Sqoop

Linux/Shell script

ZooKeeper

AWS S3, DMS, Athena

AWS Lambda

AWS/Azure

Redis/Memcache

Data Modeling

Information Retrieval

Big Data

Performance Tuning

ETL development

SQL and NoSQL

Data Pipeline Design

Awards

  • Key Contributor Award, Dataweave, 04/2019, 07/2019, for contributions to distributed crawling.
  • Team Award, Dataweave, 10/2020, 12/2020, Data Platforms Team.
  • DBS Hack2Hire - 2nd Runner Up, DBS Bank, 04/2017, 04/2017, 48-hour hackathon held by DBS in Hyderabad. The online prelims drew more than 12,000 applicants.

Timeline

Senior/Lead Data Engineer

Cisco
09.2022 - Current

Data Engineer - Platforms & Delivery

Dataweave
07.2017 - 02.2019

M.Tech - Computer Science

IIIT-B
07.2015 - 05.2017

System Engineer

Tata Consultancy Services
08.2012 - 06.2015

Lead/Senior Data Engineer - Data Platforms

Dataweave
03.2019 - 09.2022