Earam Irfan

Bengaluru

Summary

Dynamic Cloud Data Platform Engineer with a proven track record at Allstate, specializing in PySpark and Kubernetes. Expert in optimizing data pipelines and enhancing performance, achieving a 50% increase in operational efficiency. Adept at collaborating with stakeholders and delivering tailored architectural solutions, while fostering team engagement through knowledge-sharing initiatives.

Overview

6 years of professional experience

Work History

Cloud Data Platform Engineer (Consultant)

Allstate
12.2024 - Current
  • Leading a Proof of Concept (PoC) initiative for Microsoft Fabric, analyzing platform capabilities including OneLake, Lakehouse, and Fabric Notebooks, to assess feasibility for enterprise data workloads.
  • Collaborating with stakeholders and Data Product Teams (DPTs) to gather workload requirements, provide tailored architectural recommendations, and support the migration strategy from Spark on Kubernetes to Microsoft Fabric.
  • Designing and running benchmark tests comparing Spark on Kubernetes with Microsoft Fabric to evaluate trade-offs in scalability, cost, developer productivity, and integration with the Microsoft ecosystem.
  • Developing knowledge-sharing content and documentation to accelerate team onboarding to Microsoft Fabric, including architecture diagrams, runbooks, and migration playbooks.
  • Conducting weekly demo sessions to educate DPTs and drive engagement with new platform capabilities and best practices.

Big Data Developer

Allstate
02.2022 - 11.2024
  • Designed and developed PySpark applications to process and analyze large-scale datasets.
  • Implemented ETL pipelines using PySpark to extract, transform, and load data from various sources into data warehouses, ensuring data consistency and accuracy.
  • Migrated user applications from legacy Hadoop infrastructure to Kubernetes, improving performance, scalability, and operational efficiency by 50%.
  • Developed and maintained onboarding processes, creating documentation, guidelines, and dashboards to facilitate seamless user adoption of compute and storage platforms.
  • Managed and optimized platforms for over 3,000 big data users across diverse compute and storage environments, including on-prem S3, AWS, Dremio, and Hadoop.
  • Implemented Jenkins pipelines for building and pushing images to Artifactory in a Kubernetes environment, reducing manual intervention and speeding up deployments.
  • Provided consultancy and support to users, optimizing data pipelines and workloads on Dremio, Hadoop, Kubernetes, and AWS to ensure performance and cost-efficiency.
  • Designed and executed Python-based automated smoke tests for the CaaS, S3, Dremio, and Hadoop platforms to verify platform reliability and configuration.

Big Data Engineer

Accenture
08.2019 - 02.2022
  • Processed large volumes of structured and semi-structured data and loaded them into Hive tables.
  • Ingested application data from various sources, such as Business Insurance/Bond/Policy and Salesforce.
  • Applied analytical skills to interpret customers’ business needs and translate them into functional specifications.
  • Gained practical experience with Big Data technologies such as Hortonworks Hadoop, Hive, Sqoop, Spark, and HDFS, as well as data warehousing tools such as Teradata, Oracle SQL, and MySQL.
  • Worked extensively with Python, Spark, and Linux/Unix shell scripting.
  • Performed data cleaning and helped resolve data issues as they occurred.
  • Performed end-to-end testing and flow creation for the ingestion process.
  • Created Auto system jobs for the upgrade and decommissioning of the project.
  • Contributed to migrating all Hadoop jobs to AWS cloud services.

Education

Executive PG - Business Analytics

LIBA
Chennai
11.2024

B.E. - EEE

Dayananda Sagar College of Engineering
Bangalore
08.2019

Skills

  • Programming Skills: Python, SQL, PySpark, Shell Scripting
  • Platform & Tools: PyCharm, Visual Studio Code, Jupyter Notebook, Azure Databricks, Git/GitHub, WinSCP, Bash, CI/CD
  • Monitoring: Datadog, Splunk, ServiceNow
  • Big Data & ETL: Apache Spark, Hadoop, Hive, Dremio, SQL, Spark Streaming
  • Cloud & DevOps: AWS (S3, Athena, EKS), Microsoft Fabric, Jenkins, Docker, Kubernetes

Languages

English
First Language
Hindi
Proficient (C2)
