Adept Sr. Data Engineer with a proven track record at Tiger Analytics, leveraging Python, SQL, and AWS to architect and optimize data pipelines for enhanced decision-making. Demonstrated leadership in migrating data to cloud platforms, improving data quality by 30%, and mentoring teams. Excels in technical innovation and strategic problem-solving.
Overview
9 years of professional experience
1 Certification
Work History
Sr. Data Engineer
Tiger Analytics
06.2022 - Current
Developed a pipeline using Databricks and SQL Server to gain a deeper understanding of customers
Leveraged Databricks for Extract, Transform, Load (ETL) workflows to process and prepare data for analysis
Developed survey pipelines using SQL Server and Databricks, generating samples to validate data quality
Designed and implemented Databricks workflows to orchestrate ETL jobs seamlessly
Utilized Snowflake as the data warehousing solution, ensuring high-performance analytics and data accessibility
Engineered ETL processes to extract raw data from S3, apply transformations, and create structured tables for analysis (see the sketch after this list)
Developed a data quality framework and checks to ensure the accuracy and consistency of processed data
Fine-tuned ETL processes and Databricks notebook jobs to enhance data processing speed and reduce latency
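A minimal sketch of the S3-to-structured-table ETL described above, as it might run in a Databricks notebook; the bucket path, columns, and table name are hypothetical placeholders, not actual production values:

```python
# Sketch of an S3 -> structured table ETL; paths and names are assumptions.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("s3-etl-sketch").getOrCreate()

# Extract: read raw JSON landed in S3 (path is a placeholder)
raw = spark.read.json("s3://raw-bucket/customer-surveys/")

# Transform: basic cleanup and a derived date column
cleaned = (
    raw.dropDuplicates(["survey_id"])
       .filter(F.col("survey_id").isNotNull())
       .withColumn("submitted_date", F.to_date("submitted_at"))
)

# Load: persist as a structured table for downstream analysis
cleaned.write.mode("overwrite").saveAsTable("analytics.customer_surveys")
```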
Sr. Software Engineer
TSYS, A GLOBAL PAYMENTS COMPANY
Noida
06.2020 - 06.2022
Migrated an on-premise mainframe data lake to an AWS Delta Lake, creating daily and monthly ETL pipelines and handling metadata management
Built data model generation, copybook parsing, the database crawler, and all artifact generation in Python 3
Built the application to support multiple data sources, including mainframe and Oracle
Developed Jenkins pipelines to deploy the application, generate artifacts, crawl databases, and register applications
Wrote SQData scripts that connect to the mainframe and push data to Kafka topics
Containerized the application with Docker and deployed it to a Kubernetes (K8s) cluster running on AWS
Performed real-time processing via a PySpark application that reads data from Kafka, transforms it, and stores it in an S3 bucket (see the sketch after this list)
Ran hourly compaction services to optimize the tables
Ran hourly validation services that report record counts and data lag between the two engines
Wrote pytest scripts to test the functionality of Python scripts before deploying to production
Served as team lead for 1.5 years, providing technical mentorship to junior team members, performing code and design reviews, and enforcing coding standards and best practices
As an agile practitioner, participated in daily scrum, backlog refinement, sprint planning, sprint review, and sprint retrospective meetings to achieve sprint goals
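A minimal sketch of the real-time Kafka-to-S3 flow described above, using PySpark Structured Streaming; the broker address, topic, payload schema, and bucket paths are hypothetical placeholders:

```python
# Sketch of a Kafka -> S3 streaming job; names and schema are assumptions.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, TimestampType

spark = SparkSession.builder.appName("kafka-to-s3-sketch").getOrCreate()

# Assumed shape of the JSON payload on the topic
payload_schema = StructType([
    StructField("txn_id", StringType()),
    StructField("account", StringType()),
    StructField("event_ts", TimestampType()),
])

# Read the Kafka topic as a stream (broker and topic are placeholders)
events = (
    spark.readStream.format("kafka")
         .option("kafka.bootstrap.servers", "broker:9092")
         .option("subscribe", "txn-events")
         .load()
)

# Transform: parse the JSON value into typed columns
parsed = (
    events.select(F.from_json(F.col("value").cast("string"),
                              payload_schema).alias("e"))
          .select("e.*")
)

# Store into S3 as Parquet, with checkpointing for fault tolerance
query = (
    parsed.writeStream.format("parquet")
          .option("path", "s3://lake-bucket/txn-events/")
          .option("checkpointLocation", "s3://lake-bucket/checkpoints/txn-events/")
          .start()
)
```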
Sr. Software Engineer
MOTHERSONSUMI INFOTECH AND DESIGN LIMITED
01.2017 - 06.2020
Delivered a cost-effective cloud solution to migrate the client's applications from IBM Cloud to AWS, achieving high availability and fault tolerance
Designed the solution architecture using AWS services
Set up AWS Lake Formation to automate S3 permissions and jobs
Set up an EMR cluster and ran Spark jobs to move data from HDFS to EMRFS (see the sketch after this list)
Built Lambda functions to automate incremental data ingestion
Set up rclone on an EC2 instance and created multiple remotes to move data from IBM COS to AWS S3
Created IAM roles and users to grant each user specific access
Ran Jupyter notebooks to execute Spark SQL queries in Java and Python
Created dashboards for real-time, near-real-time, and batch processing
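A minimal sketch of the HDFS-to-EMRFS data move and Spark SQL querying described above, as it might run from a notebook on the EMR cluster; all paths and table names are hypothetical:

```python
# Sketch of an HDFS -> EMRFS (S3-backed) copy; paths are assumptions.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("emr-migration-sketch").getOrCreate()

# Move data by reading from HDFS and rewriting to EMRFS
hdfs_df = spark.read.parquet("hdfs:///data/orders/")
hdfs_df.write.mode("overwrite").parquet("s3://emrfs-bucket/data/orders/")

# Register the migrated data and query it with Spark SQL
spark.read.parquet("s3://emrfs-bucket/data/orders/") \
     .createOrReplaceTempView("orders")
spark.sql("SELECT status, COUNT(*) AS n FROM orders GROUP BY status").show()
```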
Software Engineer
MOTHERSONSUMI INFOTECH AND DESIGN LIMITED
12.2015 - 12.2016
Supported various applications of the organization hosted on the AWS cloud
Used Lambda extensively to automate routine operational tasks
Automated daily snapshot creation with a seven-day retention period (see the sketch after this list)
Automated reading mail notifications through an SNS topic and saving logs to an S3 bucket
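A minimal sketch of the daily snapshot automation described above as a Lambda handler, assuming EBS volumes selected by a tag; the tag scheme and snapshot description are hypothetical placeholders:

```python
# Sketch of daily EBS snapshot automation with 7-day retention;
# the Backup tag and description string are assumptions.
import datetime
import boto3

RETENTION_DAYS = 7
ec2 = boto3.client("ec2")

def lambda_handler(event, context):
    # Snapshot every volume tagged Backup=true (tag scheme is an assumption)
    volumes = ec2.describe_volumes(
        Filters=[{"Name": "tag:Backup", "Values": ["true"]}]
    )["Volumes"]
    for vol in volumes:
        ec2.create_snapshot(
            VolumeId=vol["VolumeId"],
            Description="daily-automated-backup",
        )

    # Delete automated snapshots older than the retention period
    cutoff = (datetime.datetime.now(datetime.timezone.utc)
              - datetime.timedelta(days=RETENTION_DAYS))
    snapshots = ec2.describe_snapshots(
        OwnerIds=["self"],
        Filters=[{"Name": "description", "Values": ["daily-automated-backup"]}],
    )["Snapshots"]
    for snap in snapshots:
        if snap["StartTime"] < cutoff:
            ec2.delete_snapshot(SnapshotId=snap["SnapshotId"])
```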