Data Governance (Client, Citibank) -
- Maintained and enhanced ETL pipelines integrating data from diverse sources, including flat files, relational databases, Kafka topics, REST APIs, and other external systems.
- Optimized Spark code for large-scale data processing, achieving a 4x improvement in system resilience and a 70% reduction in processing time.
- Enhanced the Scala Load Utility script to support additional source types (SQL Server, Oracle, Postgres) with robust read/write capabilities for seamless data transfer.
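The multi-source support above amounts to dispatching on source type to the right JDBC driver and URL format. A minimal sketch of that dispatch follows; the real utility is in Scala, so this Python helper, its name, and its option map are illustrative assumptions, though the driver classes and URL formats are standard JDBC conventions:

```python
# Hypothetical sketch of multi-source JDBC dispatch; the production
# utility is Scala, and this helper name is an illustration only.
JDBC_SOURCES = {
    "sqlserver": ("com.microsoft.sqlserver.jdbc.SQLServerDriver",
                  "jdbc:sqlserver://{host}:{port};databaseName={db}"),
    "oracle":    ("oracle.jdbc.OracleDriver",
                  "jdbc:oracle:thin:@//{host}:{port}/{db}"),
    "postgres":  ("org.postgresql.Driver",
                  "jdbc:postgresql://{host}:{port}/{db}"),
}

def jdbc_options(source_type: str, host: str, port: int, db: str) -> dict:
    """Build the driver/url option map a JDBC read or write expects."""
    try:
        driver, url_fmt = JDBC_SOURCES[source_type]
    except KeyError:
        raise ValueError(f"unsupported source type: {source_type}")
    return {"driver": driver,
            "url": url_fmt.format(host=host, port=port, db=db)}
```

For example, `jdbc_options("postgres", "dbhost", 5432, "sales")` yields the Postgres driver class and `jdbc:postgresql://dbhost:5432/sales`, which can be passed straight to a Spark JDBC reader or writer.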
- Automated ad-hoc data transfer requests, reducing turnaround time by 80%.
- Created a shell script to periodically archive older Hive partitions, improving data management and optimizing storage.
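The core of that archiving job is computing a retention cutoff and emitting the Hive DDL for partitions past it. A minimal sketch, assuming date-valued partitions and an illustrative partition column `load_date` (the production version is a shell script, and in practice partition data is copied to archive storage before any drop):

```python
from datetime import date, timedelta

def archive_statements(table: str, partitions: list[str],
                       retention_days: int, today: date) -> list[str]:
    """Emit Hive DDL for partitions older than the retention window.

    `partitions` holds yyyy-mm-dd partition values; the table and the
    `load_date` partition column are illustrative assumptions. In the real
    flow, each partition directory is copied to archive storage first.
    """
    cutoff = today - timedelta(days=retention_days)
    stale = [p for p in partitions if date.fromisoformat(p) < cutoff]
    return [
        f'ALTER TABLE {table} DROP IF EXISTS PARTITION (load_date="{p}");'
        for p in sorted(stale)
    ]
```

With a 90-day window and today = 2024-06-30, a 2024-01-01 partition is selected for archiving while a 2024-06-01 partition is retained.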
- Recently began building real-time streaming pipelines using Spark Structured Streaming to process data from Kafka sources.
Data Profiling (Client, Citibank) -
- Developed a Data Profiling tool to automate application onboarding on the ETL pipeline, reducing setup time from 4 days to minutes.
- Automated config generation, validation, and storage, with schema support for RDBMS, Kafka, and file sources.
- Reduced manual effort by 70% and improved onboarding speed by 90%, enabling 100+ applications to be onboarded efficiently.
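The config generation and validation step above can be sketched as producing a per-source JSON config and checking it against the keys each source type requires. The field names below are illustrative assumptions, not the tool's actual schema:

```python
import json

# Required keys per source type; names are illustrative assumptions,
# not the profiling tool's actual schema.
REQUIRED = {
    "rdbms": {"url", "table", "user"},
    "kafka": {"bootstrap_servers", "topic"},
    "file":  {"path", "format"},
}

def generate_config(app: str, source_type: str, **params) -> str:
    """Validate params for the source type; return the onboarding config as JSON."""
    if source_type not in REQUIRED:
        raise ValueError(f"unknown source type: {source_type}")
    missing = REQUIRED[source_type] - params.keys()
    if missing:
        raise ValueError(f"missing keys for {source_type}: {sorted(missing)}")
    return json.dumps({"app": app, "source_type": source_type, **params},
                      sort_keys=True)
```

Failing fast on a malformed config at onboarding time, rather than at pipeline runtime, is what removes the multi-day back-and-forth from setup.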
Proof of Concepts (Xoriant) -
- Crafted a schedu-lo-bot that delivers job-related information to end users, saving significant time otherwise spent on redundant communications.
- Implemented a Change Data Capture (CDC) stream to efficiently transfer data from multiple sources to Kafka, and subsequently to AWS S3 for long-term storage and analysis.
- Incorporated workflow scheduling and management using Airflow. Created and maintained various Directed Acyclic Graphs (DAGs) to automate tasks, resulting in improved efficiency and reduced manual errors.