PRASANTH MUTHUSAMY

Solutions Architect | Data Engineer | Digital Transformation
Bangalore, KA

Summary

Data & Cloud Architecture Leader with 10+ years of experience driving enterprise-scale platforms on AWS, Azure, Databricks, Snowflake and Spark. Skilled at leading 25+ member teams, defining data strategies, and delivering scalable lakehouse, streaming, and governance solutions with 20–30% cost savings. Proven record of business impact, growing accounts from $3.5M → ~$8M and enabling $600K+ in new analytics revenue, while pioneering GenAI/ML adoption.

Overview

10 years of professional experience
2 certifications

Work History

Data Architect | Technology Manager

MathCo
03.2024 - Current
  • Led & scaled delivery: Managed a 25+ member cross-functional engineering org (data engineering, SRE, QA) to deliver copay enablement analytics; drove account revenue from $3.5M → ~$8M and closed $600K+ in net-new analytics bookings.
  • Built production lakehouse pipelines: Architected multi-tenant ETL/ELT pipelines on AWS (S3, Glue/EMR, Lambda) using PySpark and loaded into Snowflake (Snowpipe/Streams) to enable near-real-time copay KPIs — reduced end-to-end latency from hours to minutes.
  • Streaming & realtime analytics: Designed streaming ingestion (Kinesis / Kafka + Spark Structured Streaming) and materialized views for intraday analytics, improving data freshness SLA from daily to intraday.
  • Security, governance & cost ops: Implemented Secrets Manager / KMS / IAM, RBAC and metadata/lineage flows (Unity/Atlan patterns); optimized compute, scheduling and storage, delivering ~20–30% cloud cost savings.
  • GenAI prototype (POC) → product roadmap: Delivered a secure chatbot POC (Amazon Lex, Bedrock, Lambda, Postgres, S3, and Secrets Manager) enabling conversational queries over raw transactional data and final metrics; validated use cases with stakeholders, and defined productionization requirements (vector store, RAG, provenance, and audit logging).

Consultant - BTS

ZS Associates
01.2021 - 03.2024
  • Large-scale ingestion & migration: Led end-to-end ingestion and migration initiatives (Redshift → Hadoop/EMR → Snowflake), building resilient ingestion frameworks that processed multi-TB feeds with restartability and checkpointing.
  • Performance & optimization: Tuned Hive/Spark SQL and PySpark jobs (processing ~2 TB/run), cutting job runtimes and reducing downstream SLA misses through partitioning, predicate pushdown and executor tuning.
  • Implemented automated monitoring, alerting and self-healing pipelines (Airflow / Autosys / CloudWatch) with automated RCA, reducing Mean Time To Recovery and lowering incident volume.
  • Built test automation and CI/CD pipelines (Azure DevOps / GitHub Actions, Terraform), introduced data tests and regression suites — increased deployment confidence and reduced post-release issues.
  • Partnered with business teams to translate requirements into dimensional and canonical models, accelerating metric delivery and improving time-to-consumption for analytics teams.
  • Tech stack: Hadoop/EMR, Spark, Hive, PySpark, Redshift, Airflow, Autosys, Azure DevOps, SQL, Python, Power BI.

Technology Analyst

Infosys
04.2018 - 12.2020
  • Built end-to-end PySpark ETL on EMR + S3 processing ~200 GB/day, supporting 3+ analytics use cases and improving throughput ~3x.
  • Implemented streaming pipeline (Kinesis → S3 → EMR → PySpark → Redshift) delivering near-real-time KPIs; aggregated and normalized data from 8+ sources (APIs, RDBMS, CSVs, event streams) to produce daily metrics and reduce source onboarding time.
  • Implemented monitoring, alerting and restartability (CloudWatch + checkpoints + retry logic), cutting pipeline failures ~60% and MTTR ~50%.
  • Added automated ETL tests and CI (GitHub Actions), maintained runbooks and data docs, and collaborated with analysts to validate metrics and drive adoption.

Program Analyst

Cognizant
12.2015 - 02.2018
  • Migrated datasets from Redshift → HDFS using Sqoop, LFTP and custom connectors with full validation to ensure reliable downstream delivery.
  • Developed batch + incremental ingestion frameworks for RDBMS, FTP, APIs, and flat files, orchestrated via Airflow/Oozie and integrated into curated Hive/Impala tables.
  • Optimized Hive/Impala queries, schemas, and partitioning strategies, reducing query latency by ~50% and improving overall cluster performance.
  • Implemented incremental loads and checkpointing with Sqoop and Spark, minimizing reprocessing and ensuring consistent daily or near-real-time refreshes.
  • Built automated data-quality checks and regression test suites (SQL/Python) with alerting, reducing data incidents and improving resolution times.

Education

Bachelor of Engineering

RMKEC
Chennai, TN
05.2015

Skills

Python, Spark

Certification

AWS Certified Solutions Architect
