Four years of experience designing and deploying scalable data pipelines and handling complex data integration across diverse cloud platforms
Skilled in deploying and managing relational and NoSQL databases, leveraging Elasticsearch for fast searching and analytics across
datasets, and integrating these technologies within cloud-based data pipelines for enhanced data accessibility and analysis
Robust technical acumen in leveraging big data technologies such as Hadoop for distributed data management, Spark for efficient
processing of large datasets in cluster environments, and Kafka for high-throughput real-time data streaming
Specialized in optimizing ETL processes and data pipeline performance through meticulous tuning of Hadoop clusters and Spark jobs,
ensuring minimal latency and maximum throughput in data operations
Overview
6 years of professional experience
1 Certification
Work History
Data Engineer
Berkshire Hathaway
09.2024 - Current
Developed and integrated real-time dashboards by seamlessly connecting Tableau with SQL databases and various cloud platforms,
providing executives with actionable insights that improved financial performance and market trend analysis
Designed and maintained high-performance data pipelines utilizing AWS S3, Glue, Lambda, and Redshift, achieving a 30% boost in
the efficiency of ingesting and processing high-volume financial datasets
Engineered a data synchronization framework using Pandas and SQL Server Integration Services, which enhanced the consolidation
of disparate financial data streams into a centralized warehouse, improving data accuracy and timeliness for risk assessment and
portfolio management by 35%
Refined CI/CD deployment strategy by incorporating GitLab for version control, Docker for containerization, and Kubernetes for
orchestration, which streamlined the entire lifecycle of financial data pipeline deployments, increasing deployment efficiency and
operational agility by 40%
Monitored and debugged ETL workflows daily to identify and resolve anomalies, improving data pipeline reliability and
reducing data processing errors by 15%
Collaborated with the team, project manager, and clients to gather business requirements and streamline data workflows, ensuring alignment
with organizational goals and client needs and improving project delivery time by 15%
Data Engineer
Aplus Datalytics
01.2019 - 08.2022
Developed and seamlessly integrated a machine learning pipeline using AWS SageMaker and Azure ML into the existing data
architecture, enhancing predictive analytics accuracy by 15%
Designed and enforced a data governance framework leveraging Collibra and Apache Atlas, improving metadata management and
data quality by 20%
Enhanced the data lake architecture with Amazon Redshift Spectrum and Azure Synapse Analytics, achieving a 30% increase in data
processing efficiency through optimized partitioning and indexing
Implemented a robust multi-cloud data management strategy utilizing AWS RDS, Azure SQL Database, and Google Cloud Spanner,
which increased system resilience and operational uptime by 25%, ensuring high availability and disaster recovery
Translated business requirements into tailored reports and dashboards using Power BI, delivering actionable insights on KPIs such as
click-through rate (increased by 10%), operational efficiency (boosted by 8%), and revenue growth (up by 12%)
Education
Master of Science - Information Systems
Pace University
May 2024
Bachelor of Technology - Electronics and Telecommunications
Market and Tech Advisor at Berkshire Hathaway HomeServices The Preferred Realty