Jyothi A

Bengaluru

Summary

Data Engineer with expertise in designing, developing, and optimizing data pipelines, ensuring efficient data flow and high-quality insights. Proficient in SQL, Python, cloud platforms, and big data technologies to support data-driven decision-making.

Overview

  • 6 years of professional experience
  • 1 certification

Work History

Data Engineer (HP)

Mphasis
Bengaluru
08.2023 - Current
  • Built and maintained ETL/ELT pipelines in AWS Databricks, ingesting structured and semi-structured data from S3, SQL/Oracle to Delta tables, Redshift, and Unity Catalog.
  • Developed common, reusable code to extract data from various sources.
  • Developed incremental data processing in Databricks using merge, partitioning, and optimized file formats (Delta) to improve performance and reduce costs.
  • Tuned Spark configurations (shuffle partitions, caching, and auto-scaling clusters) for optimized job execution in AWS Databricks.
  • Set up real-time monitoring using Splunk dashboards to track data pipeline failures and performance issues.
  • Debugged and resolved S3 permission issues, job failures, and Databricks notebook errors, ensuring seamless data processing.
  • Optimized long-running Spark jobs by tuning shuffle partitions, broadcast joins, and caching strategies, reducing execution time and resource consumption.
  • Collaborated with source teams to handle late-arriving data and schema evolution, ensuring smooth data ingestion.
  • Migrated Redshift data to Unity Catalog, deprecating legacy Redshift-based access controls and implementing fine-grained permissions using Unity Catalog for enhanced security and governance.

Data Engineer

Cognizant Technology Solutions
Bengaluru
03.2022 - 05.2023
  • Worked on migration of data from on-premises SQL Server to Azure cloud databases.
  • Created ADF pipelines to extract data from on-premises source systems to Azure Data Lake Storage.
  • Worked extensively on copy activities and implemented copy behaviors such as Flatten hierarchy, Preserve hierarchy, and Merge files; implemented error handling through the copy activity.
  • Developed Spark notebooks to transform and partition the data and organize files in ADLS; used Azure Databricks to run Spark (Python) notebooks through ADF pipelines.
  • Created linked services for multiple source systems, including Azure SQL Server, ADLS, and Blob Storage.
  • Configured Logic Apps to send email notifications to end users and key stakeholders using the web activity.
  • Created a dynamic pipeline to handle extraction from multiple sources to multiple targets; used Azure Key Vault extensively to configure connections in linked services.
  • Developed Databricks notebooks to perform data cleaning and transformation on various tables using Spark SQL and PySpark.
  • Developed and maintained CI/CD pipelines for Azure Data Factory (ADF) using Azure DevOps and Git.

Data Engineer

Syren Technologies
Bengaluru
06.2021 - 02.2022
  • Worked on data ingestion from Excel files and emails into Azure Blob Storage and processed the data using Azure Databricks.
  • Developed Spark notebooks in Databricks to clean, transform, and partition the data before loading it into the final SQL database in Azure.
  • Ensured schema validation and data consistency by performing data profiling activities, including checking data types, null values, and anomalies before final loading.
  • Optimized data transformation logic by implementing efficient joins, caching strategies, and partitioning techniques in Databricks to improve performance.
  • Monitored and troubleshot Databricks job failures, ensuring smooth data processing and resolving schema-related issues during ingestion and transformation.

Analyst

Embibe
08.2019 - 06.2021
  • Gathered data from various sources, such as student information systems, learning management systems, and external data providers, ensuring data integrity and accuracy.
  • Developed and maintained data ingestion processes to upload raw data from different sources to Azure Blob storage, ensuring timely and reliable data availability.
  • Performed data mapping from raw data to master data structures, ensuring consistent and standardized data across different systems and databases.
  • Implemented data cleansing techniques to identify and rectify data quality issues, including data validation, outlier detection, and data transformation.
  • Conducted data aggregations and summarizations to generate meaningful insights and reports for key stakeholders, supporting data-driven decision-making processes.

Education

B.Tech

Alliance University
06.2018

Pre-University

Vishwa Chethana PU College
04.2014

SSLC

Jayashree Education Trust
04.2012

Skills

  • AWS/Azure Databricks
  • Azure Blob Storage/S3
  • Azure DevOps
  • Delta Lake
  • PySpark
  • SQL
  • Python
  • Redshift
  • GitHub

Certification

  • Microsoft, 06/30/23, Microsoft Certified: Azure Data Engineer
  • Udemy, 05/05/23, Azure Data Factory for Data Engineers
  • Udemy, 03/30/23, Azure Databricks and Spark for Data Engineers

Languages

  • English
  • Kannada
  • Hindi
  • Telugu
  • German
