PRASANTH MUTHUSAMY

Solutions Architect | Data Engineer | Digital Transformation
Bangalore, KA

Summary

Data & Cloud Architecture Leader with 10+ years of experience driving enterprise-scale platforms on AWS, Azure, Databricks, Snowflake and Spark. Skilled at leading 25+ member teams, defining data strategies, and delivering scalable lakehouse, streaming, and governance solutions with 20–30% cost savings. Proven record of business impact, growing accounts from $3.5M → ~$8M and enabling $600K+ in new analytics revenue, while pioneering GenAI/ML adoption.

Overview

10 years of professional experience
2 certifications

Work History

Data Architect | Technology Manager

MathCo
03.2024 - Current
  • Led & scaled delivery: Managed a 25+ member cross-functional engineering org (data engineering, SRE, QA) to deliver copay enablement analytics; drove account revenue from $3.5M → ~$8M and closed $600K+ in net-new analytics bookings.
  • Built production lakehouse pipelines: Architected multi-tenant ETL/ELT pipelines on AWS (S3, Glue/EMR, Lambda) using PySpark and loaded into Snowflake (Snowpipe/Streams) to enable near-real-time copay KPIs — reduced end-to-end latency from hours to minutes.
  • Streaming & realtime analytics: Designed streaming ingestion (Kinesis / Kafka + Spark Structured Streaming) and materialized views for intraday analytics, improving data freshness SLA from daily to intraday.
  • Security, governance & cost ops: Implemented Secrets Manager / KMS / IAM, RBAC and metadata/lineage flows (Unity/Atlan patterns); optimized compute, scheduling and storage, delivering ~20–30% cloud cost savings.
  • GenAI prototype (POC) → product roadmap: Delivered a secure chatbot POC (Amazon Lex, Bedrock, Lambda, Postgres, S3, and Secrets Manager) enabling conversational queries over raw transactional data and final metrics; validated use cases with stakeholders, and defined productionization requirements (vector store, RAG, provenance, and audit logging).

Consultant - BTS

ZS Associates
01.2021 - 03.2024
  • Large-scale ingestion & migration: Led end-to-end ingestion and migration initiatives (Redshift → Hadoop/EMR → Snowflake), building resilient ingestion frameworks that processed multi-TB feeds with restartability and checkpointing.
  • Performance & optimization: Tuned Hive/Spark SQL and PySpark jobs (processing ~2 TB/run), cutting job runtimes and reducing downstream SLA misses through partitioning, predicate pushdown and executor tuning.
  • Implemented automated monitoring, alerting and self-healing pipelines (Airflow / Autosys / CloudWatch) with automated RCA, reducing Mean Time To Recovery and lowering incident volume.
  • Built test automation and CI/CD pipelines (Azure DevOps / GitHub Actions, Terraform), introduced data tests and regression suites — increased deployment confidence and reduced post-release issues.
  • Partnered with business teams to translate requirements into dimensional and canonical models, accelerating metric delivery and improving time-to-consumption for analytics teams.
  • Tech stack: Hadoop/EMR, Spark, Hive, PySpark, Redshift, Airflow, Autosys, Azure DevOps, SQL, Python, Power BI.

Technology Analyst

Infosys
04.2018 - 12.2020
  • Built end-to-end PySpark ETL on EMR + S3 processing ~200 GB/day, supporting 3+ analytics use cases and improving throughput ~3x.
  • Implemented streaming pipeline (Kinesis → S3 → EMR → PySpark → Redshift) delivering near-real-time KPIs; aggregated and normalized data from 8+ sources (APIs, RDBMS, CSVs, event streams) to produce daily metrics and reduce source onboarding time.
  • Implemented monitoring, alerting and restartability (CloudWatch + checkpoints + retry logic), cutting pipeline failures ~60% and MTTR ~50%.
  • Added automated ETL tests and CI (GitHub Actions), maintained runbooks and data docs, and collaborated with analysts to validate metrics and drive adoption.

Program Analyst

Cognizant
12.2015 - 02.2018
  • Migrated datasets from Redshift → HDFS using Sqoop, LFTP and custom connectors with full validation to ensure reliable downstream delivery.
  • Developed batch + incremental ingestion frameworks for RDBMS, FTP, APIs, and flat files, orchestrated via Airflow/Oozie and integrated into curated Hive/Impala tables.
  • Optimized Hive/Impala queries, schemas, and partitioning strategies, reducing query latency by ~50% and improving overall cluster performance.
  • Implemented incremental loads and checkpointing with Sqoop and Spark, minimizing reprocessing and ensuring consistent daily or near-real-time refreshes.
  • Built automated data-quality checks and regression test suites (SQL/Python) with alerting, reducing data incidents and improving resolution times.

Education

Bachelor of Engineering

RMKEC
Chennai, TN
05.2015

Skills

Python, Spark

Certification

AWS Certified Solutions Architect
