

Highly motivated Senior Data Engineer with 5+ years of experience architecting, scaling, and optimizing high-throughput data pipelines and modern cloud-native data warehouses and data lakes. Expert in translating complex business requirements into scalable, analytics-ready technical solutions across GCP, AWS, and hybrid environments. Proven track record of designing dimensional models and managing petabyte-scale, real-time, and batch data streams to drive critical corporate decision-making. Strong communicator skilled at partnering with global cross-functional stakeholders and engineering teams in rapidly changing environments.
• Architected high-volume ETL pipelines using AWS, Kafka, Python, Pandas, and Snowflake, cutting processing time by 30%.
• Automated data ingestion from Outlook and SharePoint APIs with Python, reducing manual effort by 40%.
• Designed audit logging, monitoring, and alerting frameworks for proactive data quality management.
• Automated Power BI dataset refreshes via Airflow, reducing manual overhead by ~20 hours/month.
• Built CloudFormation templates for data infrastructure provisioning, enabling scalable and cost-efficient pipelines.
• Standardized business logic and transformations via dbt, improving maintainability and consistency across models.
• Built and maintained scalable ETL pipelines using SQL, Spark, and Databricks, integrating data from multiple enterprise systems into analytics-ready datasets.
• Designed dimensional data models (star/snowflake schemas) to support enterprise BI reporting and analytics.
• Optimized Teradata queries, reducing execution times by up to 25% for high-volume analytical workloads.
• Enhanced pipelines to support data governance, lineage, and auditability for compliance-driven projects.
• Collaborated with cross-functional stakeholders in the Modern Analytics Insights Team to deliver new features and improvements.
Programming & Data: Python, SQL (Advanced), Scala, Pandas, Spark, Teradata, REST API Integration
Cloud Platforms & Infrastructure: AWS (S3, Lambda, CloudFormation), GCP (BigQuery, Pub/Sub, Dataflow, Cloud Composer, GCS), Hybrid Infrastructure, Snowflake, Databricks
ETL / Real-Time Pipelines: Kafka, Airflow, dbt, Buildkite, Jenkins, Docker, Kubernetes
Data Architecture & Science Readiness: Dimensional Data Modeling (Star/Snowflake), Modern Cloud Data Warehouses, Data Lakes, Schema Enforcement, CDC, Data Governance, Data Quality & Lineage, Machine Learning/Data Science Feature Support
DevOps & CI/CD: Terraform, Git, Containerization, Infrastructure as Code (IaC)
HackerRank certified in Python Programming