CHANAKYA SHARMA

Senior Software Engineer
Hyderabad

Summary

Experienced Data Engineer with 8+ years of expertise in designing, developing, and maintaining high-performance data architectures and ETL pipelines. Specializing in Snowflake development and leveraging AWS services—including Step Functions, EMR, S3, and Glue—to build scalable, automated solutions that integrate diverse data sources.

Overview

9 years of professional experience
4 years of post-secondary education
2 Certifications

Work History

Senior Software Engineer

Carelon Global Solutions
07.2021 - Current

Apree Health Value Based Care

  • Architected an end-to-end ETL pipeline using AWS Step Functions and EMR together with PySpark, SQL, and Snowflake to automate the identification of enrolled members eligible for care interventions under Apree Health's Value-Based Care program.
  • Designed the pipeline as a decoupled framework in which each processing stage, from data ingestion through SQL transformations, file generation, and FTP delivery, can be restarted independently from its point of failure. This resiliency is achieved with modular AWS Glue jobs that handle discrete process segments (Ingestion, SQL, Adding Headers/Trailers, CFX processing).
  • Integrated asynchronous AWS Step Functions to orchestrate the ETL workflow. At every phase completion (e.g., data ingestion, SQL queries, file generation, FTP transmission), the pipeline updates an audit table, ensuring full traceability and process transparency.
  • Implemented robust error-handling mechanisms with automated email alerts triggered on any process failure. Each alert includes the exact failed step and error details, with direct links to AWS CloudWatch logs for prompt troubleshooting and remediation (a simplified restart-and-alert sketch follows this list).
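
Illustrative sketch (not the production code): a minimal Python outline of the restart-from-failure and alerting flow described above. The stage names, in-memory audit store, and send_alert helper are hypothetical stand-ins for the Glue jobs, Snowflake audit table, and email alerts used in the actual pipeline.

# Minimal sketch, assuming each stage maps to an AWS Glue job invoked by Step Functions;
# here stages are plain functions and the audit table is an in-memory dict.
STAGES = ["ingestion", "sql_transform", "add_headers_trailers", "cfx_processing", "ftp_delivery"]

def run_pipeline(run_id: str, audit: dict) -> None:
    """Execute stages in order, skipping any stage already marked successful."""
    for stage in STAGES:
        if audit.get((run_id, stage)) == "SUCCEEDED":
            continue  # decoupled: a restart resumes at the first incomplete stage
        try:
            execute_stage(stage)                  # placeholder for a Glue job run
            audit[(run_id, stage)] = "SUCCEEDED"  # audit row written after each phase
        except Exception as exc:
            audit[(run_id, stage)] = "FAILED"
            send_alert(run_id, stage, exc)        # alert carries failed step + error details
            raise

def execute_stage(stage: str) -> None:
    print(f"running {stage}")

def send_alert(run_id: str, stage: str, exc: Exception) -> None:
    print(f"ALERT run={run_id} stage={stage} error={exc}")

audit_table: dict = {}
run_pipeline("run-001", audit_table)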

Medicare Customer Analytics Platform

  • Engineered an advanced ETL pipeline that retrieves near real-time data from Snowflake to consolidate Medicare member journey survey data, chat surveys and transcripts, disenrollment records, and mock CAHPS survey data, integrating these diverse sources into an analytics platform.
  • Automated the extraction process where processed data is stored in Amazon S3 buckets, serving as the centralized data lake from which the analytics platform accesses the latest insights for proactive member outreach.
  • Utilized AWS Step Functions to manage interdependencies between data feeds, orchestrating sequential execution with asynchronous triggers to ensure orderly and efficient processing. Implemented robust data scrubbing mechanisms using Snowflake tokenisation policies to filter out invalid phone numbers, ensuring high data quality and compliance.
  • Applied PySpark DataFrame functions to append file counts at the end of each record, enabling cross-verification of ingested record counts against expected file counts for enhanced data accuracy. Leveraged persistent Snowflake tables to maintain state, filtering out previously processed records so that only the latest survey data is ingested, thereby eliminating duplicates and ensuring fresh insights (a brief PySpark sketch of this count check and duplicate filtering follows this list).
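
Illustrative only: a compact PySpark sketch of the two mechanics above, appending a per-file record count and anti-joining against a persisted set of already-processed records. Table names, columns, and sample rows are invented for the example; the production job reads from and writes to Snowflake.

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("survey-ingest-sketch").getOrCreate()

# Sample stand-ins for the Snowflake sources (schemas are hypothetical).
incoming = spark.createDataFrame(
    [("m1", "2024-01-05", "chat"), ("m2", "2024-01-06", "cahps")],
    ["member_id", "survey_date", "source"],
)
processed = spark.createDataFrame([("m1", "2024-01-05")], ["member_id", "survey_date"])

# Reconciliation: carry the file's total record count on every row so the load
# can be cross-checked against the expected count.
with_count = incoming.withColumn("file_record_count", F.lit(incoming.count()))

# State filter: a left anti-join drops records already loaded in a prior run.
fresh = with_count.join(processed, ["member_id", "survey_date"], "left_anti")
fresh.show()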

Department of Justice False Claims Act Litigation

  • Architected and implemented a robust ETL pipeline using Informatica PowerCenter and Teradata to extract Medicare claims data for DOJ investigations, addressing allegations of false claims submitted to CMS related to historical Medicare Advantage claims.
  • Leveraged Teradata Parallel Transporter (TPT) connections to optimize the extraction of large datasets, ensuring efficient processing and minimal latency. Tuned complex SQL queries by analyzing explain plans to evaluate byte consumption and execution time, driving significant performance improvements.
  • Implemented transaction control mechanisms to commit monthly data changes, automatically generating separate output files for each period to maintain data integrity (a simplified illustration of this per-period split follows this list).
  • Orchestrated the entire ETL process using Control-M (CTM), employing watcher jobs and auto count jobs to monitor execution frequencies and ensure consistent job performance across multiple years. This solution not only streamlined data processing but also ensured compliance and operational efficiency for legal and regulatory reviews.
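
For illustration only: the monthly transaction-control behaviour was built in Informatica PowerCenter, but the same split-by-period idea can be sketched in a few lines of pandas (column names and sample data are hypothetical).

import pandas as pd

# Hypothetical claims extract; the real data came from Teradata via TPT.
claims = pd.DataFrame(
    {
        "claim_id": ["c1", "c2", "c3"],
        "service_date": pd.to_datetime(["2015-01-10", "2015-01-22", "2015-02-03"]),
        "paid_amount": [120.0, 75.5, 310.0],
    }
)

# Commit (write) claims in monthly batches, producing one output file per period.
claims["period"] = claims["service_date"].dt.to_period("M")
for period, batch in claims.groupby("period"):
    batch.drop(columns="period").to_csv(f"claims_{period}.csv", index=False)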

IT Analyst

Tata Consultancy Services
07.2016 - 07.2021

CMS Mandate Data Quality & Compliance

  • Architected an ETL pipeline using Informatica PowerCenter to process and cleanse data in compliance with CMS mandates. Leveraged built-in transformations to perform null checks, case normalization, removal of special characters, and filtering of extraneous data.
  • Designed and stored a suite of SQL queries in an Oracle database to execute granular field-level and record-level data quality checks. These validations are dynamically mapped to specific column names and check types, and their execution is fully automated via PL/SQL scripts.
  • Integrated SSRS reporting to generate detailed compliance reports upon completion of the data cleansing process, highlighting compliance ratios across each category of data quality checks. This enables rapid identification of anomalies and supports continuous process improvement.
  • This comprehensive, CMS-compliant framework ensures high data quality and robust compliance monitoring, supporting proactive remediation and preserving data integrity (a simplified illustration of the metadata-driven checks follows this list).
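
A self-contained sketch of the metadata-driven quality checks described above. The actual framework stores the check SQL in Oracle and runs it through PL/SQL with SSRS reporting; sqlite3 is used here only so the example runs standalone, and the column names, check types, and sample data are invented.

import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE members (member_id TEXT, zip_code TEXT);
    INSERT INTO members VALUES ('m1', '75001'), ('m2', NULL), ('m3', '7500A');
    """
)

# Each entry maps a column and check type to a SQL predicate that flags failures,
# mirroring the validations stored and executed in the database.
CHECKS = [
    ("member_id", "null_check", "member_id IS NULL"),
    ("zip_code", "null_check", "zip_code IS NULL"),
    ("zip_code", "format_check", "zip_code GLOB '*[^0-9]*'"),
]

total = conn.execute("SELECT COUNT(*) FROM members").fetchone()[0]
for column, check_type, predicate in CHECKS:
    failures = conn.execute(f"SELECT COUNT(*) FROM members WHERE {predicate}").fetchone()[0]
    ratio = (total - failures) / total  # compliance ratio reported per check
    print(f"{column:<10} {check_type:<13} compliance={ratio:.0%}")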

CDAG/ODAG Universe Generation

  • Spearheaded the generation of CDAG (Coverage Determinations, Appeals, and Grievances) and ODAG (Organization Determinations, Appeals, and Grievances) audit universes, integrating data from distinct feeds. Leveraged advanced data modeling and ETL optimization techniques to harmonize disparate data sources, delivering a comprehensive solution using Informatica, Oracle DB, and SSRS reports.
  • Orchestrated the end-to-end process using the Univiewer Console for scheduling and automated execution. The pipeline ensures sequential data quality checks across each feed, culminating in detailed SSRS reports that showcase compliance ratios and highlight data anomalies.
  • Maintained a robust star schema leveraging the provided enrollment tables to correlate and extract essential data elements across the pipeline. Integrated fact tables covering claims, enrollments, and transactional records with corresponding dimension tables (e.g., member demographics, provider details, and time/date attributes). This design streamlined data joins and optimized query performance, ensuring high-fidelity data extraction for downstream processes (a simplified star-schema query follows this list).
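
Hypothetical illustration of the star-schema join pattern described above; the table and column names are invented (the real universes were built on Oracle), and sqlite3 is used only to keep the example self-contained.

import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE fact_claims (claim_id TEXT, member_key INT, date_key INT, paid_amount REAL);
    CREATE TABLE dim_member  (member_key INT, member_id TEXT, plan_type TEXT);
    CREATE TABLE dim_date    (date_key INT, calendar_month TEXT);
    INSERT INTO fact_claims VALUES ('c1', 1, 202401, 120.0), ('c2', 2, 202402, 75.5);
    INSERT INTO dim_member  VALUES (1, 'm1', 'MAPD'), (2, 'm2', 'PDP');
    INSERT INTO dim_date    VALUES (202401, '2024-01'), (202402, '2024-02');
    """
)

# Fact rows join to each dimension on a surrogate key; dimensions never join to
# each other, which keeps joins simple and query plans fast.
rows = conn.execute(
    """
    SELECT d.calendar_month, m.plan_type, SUM(f.paid_amount) AS total_paid
    FROM fact_claims f
    JOIN dim_member m ON m.member_key = f.member_key
    JOIN dim_date   d ON d.date_key   = f.date_key
    GROUP BY d.calendar_month, m.plan_type
    """
).fetchall()
print(rows)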

Education

Bachelor - Electronics and Communication

Jawaharlal Nehru Technological University
Hyderabad, India
04.2012 - 04.2016

Skills

Python

Pandas

SQL

Oracle

Teradata

Snowflake

Informatica PowerCenter

Toad

Control-M


Certification

SnowPro Core Certified

Recognition Awards

  • Honored with the Impact Award for successfully delivering a critical data extract related to litigation on processed claims.
  • Participated in a tower-level hackathon event, contributing innovative automation ideas using cloud technologies.
  • Became familiar with industry standards and regulations (HIPAA, CMS, DOI).

Timeline

Senior Software Engineer

Carelon Global Solutions
07.2021 - Current

IT Analyst

Tata Consultancy Services
07.2016 - 07.2021

Bachelor - Electronics and Communication

Jawaharlal Nehru Technological University
04.2012 - 04.2016