Shruthi DN

Data Engineer
Bengaluru

Summary

Senior Data Engineer with 12+ years of experience specializing in data architecture and analytics solutions. Experienced in Data Engineering, Data Warehousing, and Data Lakes, using technologies such as Python, AWS, LLMs, ETL, BI, and SQL. Committed to driving efficiency through DevOps practices and Agile methodologies in data modeling and database design.

Overview

7 Certificates
12 years of professional experience

Work History

AWS/BI Consultant

Incept Data Solutions
02.2023 - Current

Client: Sanofi (CHC, BioPharma), Regeneron

DATA & AI ENGINEERING / DATA QUALITY PLATFORM (DATAORION)

1. Built DataOrion, an AI-powered data platform on Snowflake enabling automated, end-to-end AI-ready data preparation.
2. Developed LLM-driven metadata generation (BMG) integrated with Informatica CDGC using Snowflake Cortex.
3. Designed Qualii, an LLM-driven Data Quality engine that generates SQL rules from natural language by combining metadata, sample data, and user context.
4. Engineered scalable pipelines using Snowflake, Python, and Streamlit, enabling agentic workflow orchestration.
5. Implemented config-driven DQ framework (EDG-DQ) supporting in-motion & at-rest validation.
6. Built SQL-based rule engine, error logging, and failed-record tracking for enterprise datasets.
7. Automated DQ score computation & publishing to CDGC, improving governance and catalog insights.
8. Integrated pipelines with Informatica CDI/IICS for orchestration and scheduling.
9. Built Streamlit UI + backend services for rule generation, deduplication, and SQL normalization.
10. Integrated Snowflake Cortex models for intelligent rule recommendations.

AWS SERVERLESS DQ & ANALYTICS

1. Built serverless DQ Score Utility (AWS Lambda, API Gateway, DynamoDB, S3, SNS).
2. Developed API-driven DQRO processing (GET/POST) with CDGC integration.
3. Implemented secure APIs (OAuth2/JWT via Layer7) and job tracking with DynamoDB.
4. Automated deployments using Terraform + GitHub Actions (CI/CD).
5. Engineered serverless pipelines for structural DQ checks integrated with Informatica ingestion flows.

POWER BI DATA QUALITY DASHBOARD

1. Built enterprise DQ dashboards using Snowflake + Power BI for real-time insights.
2. Developed KPIs, DAX, and data models for DQ score tracking and trend analysis.
3. Implemented RBAC and reusable templates for scalable deployments.

Client: Teradyne

MELISSA ADDRESS VALIDATION

1. Built batch address validation pipeline using Melissa API (Python + Snowflake).
2. Implemented parallel processing, batching, and caching for performance optimization.
3. Mapped Melissa codes (AE/AC/AV/AS) into structured outputs.
4. Built dynamic validation classification (Verified, Partial, Failed).
5. Enabled address standardization with error handling and retries.

Senior Consultant

Ernst & Young
08.2022 - 02.2023

Provided expert consulting services to optimize client technology infrastructure.

  • Provisioned and validated tech infrastructure, and set up development environments, including VDI and infrastructure access.
  • Tracked project activities and facilitated client-facing calls to ensure alignment and address concerns
  • Started learning PySpark and Azure Data Factory for data processing tasks.

Tech Lead - Advanced Analytics

Hinduja Global Solutions
08.2019 - 07.2022
  • Created a data lake and designed and developed AWS serverless ETL pipelines for operational analytics dashboards in Power BI.
  • Served as Scrum Master, creating and assigning Jira tasks to a 7-member development team to streamline project workflow.
  • Automated manual reports generated by internal teams, eliminating 40 hours of manual effort per week and reducing cost.
  • Managed CI/CD deployments using Git, Bitbucket, Jenkins on EC2 and Terraform to ensure smooth release processes.
  • Conducted training and mentoring sessions on SQL and AWS data analytics services for new university recruits and existing team members.
  • Established a centralized repository for SQL Server Agent job metadata, job statistics, and system performance metrics from on-premises servers, supporting the Power BI team's dashboard implementation.

ETL Developer

IBM
02.2014 - 08.2019
  • Developed design documents, ETL design specifications, and data models (LDM/PDM) to support ETL processes.
  • Developed ETL processes using SSIS (automobile domain), Informatica (work and pensions domain), and SQL.
  • Prepared the test strategy document.
  • Developed and unit tested functionality based on design specifications, supported system integration testing, and migrated data from the IDIT system to the GENESIS system using DB2.
  • Worked on detailed design documents and DBRs (Detailed Business Requirements).
  • Analyzed defects in TSRM.
  • Participated in build and deployment phases for creating packages in RTC and conducting smoke tests.

Education

BE - Bachelor of Computer Science

University Visvesvaraya College of Engineering
Bengaluru, KA

Skills

Business intelligence tools

ETL tools expertise

Data Pipeline Management

Database management systems

Programming Languages: PySpark, Python, SQL, Unix Shell Scripting

Cloud and OS platforms

Azure services

Container Orchestration

Data modeling

Pharmaceutical industry expertise

Agentic AI: Snowflake Cortex LLM

Streamlit AI

Claude

Accomplishments

  • Eminence and Excellence Spark Award, IBM, 2016
  • Best Contributor Award, IBM, 2019
  • Best Performer Award, HGS, 2022

Certification

Completed AWS certification: AWS Certified Developer - Associate

Timeline

AWS/BI Consultant

Incept Data Solutions
02.2023 - Current

Senior Consultant

Ernst & Young
08.2022 - 02.2023

Tech Lead - Advanced Analytics

Hinduja Global Solutions
08.2019 - 07.2022

ETL Developer

IBM
02.2014 - 08.2019

BE - Bachelor of Computer Science

University Visvesvaraya College of Engineering