Parth Dodia

Data Engineer
Jersey City, NJ

Summary

  • Four years of experience designing and deploying scalable data pipelines and handling complex data integration across diverse cloud platforms
  • Skilled in deploying and managing relational and NoSQL databases, using Elasticsearch for fast search and analytics across datasets, and integrating these technologies into cloud-based data pipelines to improve data accessibility and analysis
  • Strong technical background in big data technologies: Hadoop for distributed data management, Spark for efficient processing of large datasets in cluster environments, and Kafka for high-throughput real-time data streaming
  • Specialized in optimizing ETL processes and data pipeline performance by tuning Hadoop clusters and Spark jobs to reduce latency and increase throughput

Overview

6 years of professional experience
1 Certification

Work History

Data Engineer

Berkshire Hathaway
09.2024 - Current
  • Developed and integrated real-time dashboards by seamlessly connecting Tableau with SQL databases and various cloud platforms, providing executives with actionable insights that improved financial performance and market trend analysis
  • Designed and maintained high-performance data pipelines utilizing AWS S3, Glue, Lambda, and Redshift, achieving a 30% boost in the efficiency of ingesting and processing high-volume financial datasets
  • Engineered a data synchronization framework using Pandas and SQL Server Integration Services, which enhanced the consolidation of disparate financial data streams into a centralized warehouse, improving data accuracy and timeliness for risk assessment and portfolio management by 35%
  • Refined CI/CD deployment strategy by incorporating GitLab for version control, Docker for containerization, and Kubernetes for orchestration, which streamlined the entire lifecycle of financial data pipeline deployments, increasing deployment efficiency and operational agility by 40%
  • Monitored and debugged ETL workflows daily to identify and resolve anomalies, improving the reliability of the data pipeline and ensuring a 15% decrease in data processing errors, contributing to system optimization and overall performance
  • Collaborated with the team, project manager, and clients to gather business requirements and streamline data workflows, ensuring alignment with organizational goals and client needs and resulting in a 15% improvement in project delivery time

Data Engineer

Aplus Datalytics
01.2019 - 08.2022
  • Developed and seamlessly integrated a machine learning pipeline using AWS SageMaker and Azure ML into the existing data architecture, enhancing predictive analytics accuracy by 15%
  • Designed and enforced a data governance framework leveraging Collibra and Apache Atlas, improving metadata management and data quality by 20%
  • Enhanced the data lake architecture with Amazon Redshift Spectrum and Azure Synapse Analytics, achieving a 30% increase in data processing efficiency through optimized partitioning and indexing
  • Implemented a robust multi-cloud data management strategy utilizing AWS RDS, Azure SQL Database, and Google Cloud Spanner, which increased system resilience and operational uptime by 25%, ensuring high availability and disaster recovery
  • Translated business requirements into tailored reports and dashboards using Power BI, delivering actionable insights on KPIs such as click-through rate (increased by 10%), operational efficiency (boosted by 8%), and revenue growth (up by 12%)

Education

Master of Science - Information Systems

Pace University
May 2024

Bachelor of Technology - Electronics and Telecommunications

Mumbai University
May 2020

Skills

  • Methodologies: Agile, Waterfall
  • Languages: Python, Scala, SQL
  • ETL Tools: Informatica, Talend, Alteryx
  • Databases: MySQL, PostgreSQL, MongoDB, Snowflake, Oracle, Redshift
  • Big Data technologies: Apache Spark, Hadoop, Kafka, Airflow
  • Tools: Power BI, Tableau, Git, Ansible, Chef, Eclipse, Jupyter Notebook, CloudEndure, Jira
  • Cloud Services: AWS (S3, Glue, Lambda), Azure (Data Factory, Synapse Analytics), GCP (BigQuery)

Accomplishments

  • Certifications: IBM Data Engineering Professional, Google Analytics

Certification

Data Pipeline Orchestration, Data Warehousing, Data Modeling, Data Visualization

Work Preference

Work Type

Full Time

Location Preference

Hybrid
