RAJBEER SINGH

Uttarakhand

Summary

Adept Sr. Data Engineer with a proven track record at Tiger Analytics, leveraging Python, SQL, and AWS to architect and optimize data pipelines for enhanced decision-making. Demonstrated leadership in migrating data to cloud platforms, improving data quality by 30%, and mentoring teams. Excels in technical innovation and strategic problem-solving.

Overview

9 years of professional experience
1 Certification

Work History

Sr. Data Engineer

Tiger Analytics
06.2022 - Current
  • Developed a pipeline using Databricks and SQL Server to provide a better understanding of customers
  • Leveraged Databricks for Extract, Transform, Load (ETL) processes to effectively process and prepare data for analysis
  • Developed several survey pipelines using SQL Server and Databricks and generated samples to validate data quality
  • Designed and implemented Databricks workflows to orchestrate ETL jobs seamlessly
  • Utilized Snowflake as the data warehousing solution, ensuring high-performance analytics and data accessibility
  • Engineered ETL processes to extract raw data from S3, perform transformations, and create structured tables for analysis
  • Developed a data quality framework and quality checks to ensure the accuracy and consistency of processed data
  • Fine-tuned ETL processes and Databricks notebook jobs to enhance data processing speed and reduce latency

Sr. Software Engineer

TSYS, A GLOBAL PAYMENTS COMPANY.
Noida
06.2020 - 06.2022
  • Objective: migrate an on-premises mainframe data lake to a Delta Lake on AWS, creating daily and monthly ETL pipelines and handling metadata management
  • Implemented data model generation, copybook parsing, the database crawler, and all artifact generation in Python 3
  • The application supports multiple data sources, including mainframe and Oracle
  • Developed Jenkins pipelines to deploy the application, generate artifacts, crawl databases, and register applications
  • Wrote SQData scripts that connect to the mainframe and push data to a Kafka topic
  • Packaged and deployed the application as Docker containers running on a Kubernetes (K8s) cluster on AWS
  • Implemented real-time processing with a PySpark application that reads data from Kafka, transforms it, and stores it in an S3 bucket
  • Compaction services run every hour to optimize the tables
  • Validation services run every hour and report record counts and data lag between two different engines
  • Wrote pytest scripts to test the functionality of Python scripts before deploying to production
  • Served as team lead for 1.5 years, providing technical mentorship to junior team members, performing code and design reviews, and enforcing coding standards and best practices
  • As an Agile practitioner, participated in daily scrums, backlog refinement, sprint planning, sprint reviews, and sprint retrospectives to achieve sprint goals

Sr. Software Engineer

MOTHERSONSUMI INFOTECH AND DESIGN LIMITED
01.2017 - 06.2020
  • Objective: provide a cost-effective cloud solution for a client migrating from IBM Cloud to AWS, achieving high availability and fault tolerance for their application
  • Designed the solution architecture using AWS services
  • Set up AWS Lake Formation to automate the process (S3 permissions, jobs)
  • Set up an EMR cluster and ran Spark jobs to move data from HDFS to EMRFS
  • Used Lambda to automate ingestion of incremental data
  • Set up rclone on an EC2 instance and created multiple remotes to bring data from IBM COS to AWS S3
  • Created IAM roles and users to grant specific access to users
  • Ran Jupyter notebooks to execute Spark SQL queries in Java / Python
  • Created various dashboards for real-time, near-real-time, and batch processing

Software Engineer

MOTHERSONSUMI INFOTECH AND DESIGN LIMITED
12.2015 - 12.2016
  • Various applications of the organization are hosted on the AWS cloud
  • Lambda plays an important role in automating various tasks
  • Automated daily snapshots with a seven-day retention period
  • Automated reading of emails through an SNS topic and saving logs to an S3 bucket
  • Serverless handling of DB backups

Skills

  • Python
  • SQL
  • PySpark
  • Spark
  • Hadoop
  • AWS
  • ECS
  • Postgres
  • SQL Server
  • Jenkins
  • Airflow
  • Docker
  • Snowflake
  • GenAI
  • Power BI

Certification

PYTHON Entry Level Certification
