Mangesh Chikane

Associate Data Architect
Pimpri Chinchwad

Summary

Cloud and Big Data Developer with 6+ years of technical experience in data technologies. Proficient in Python, SQL, and Big Data development, with a strong background in data migration and cloud services (GCP and AWS). Experienced in Agile software development environments and in handling structured and semi-structured data. Hands-on experience with Sqoop, HDFS, NiFi, Spark, Hive, MapReduce, Oozie, HBase, and Airflow.

Overview

7 years of professional experience
3 Certifications
3 Languages

Work History

Associate Technical Architect-Data

Quantiphi Analytics Pvt. Ltd.
08.2022 - Current

Project: Healthcare Data Migration

Technologies:

  • GCP Services: BigQuery, BQMS, Dataflow (Apache Beam), Composer
  • Programming Languages: BigQuery SQL, Python, Shell script

Description: Migrated healthcare data from Epic, Cerner, and HL7 systems to GCP for enhanced data analysis and processing.

Responsibilities:

  • Designed and implemented data ingestion pipelines from Epic, Cerner, and HL7 to GCP.
  • Migrated DataStage and DMX processes to GCP for efficient data extraction and transformation.
  • Configured and managed data storage with BigQuery and Google Cloud Storage (GCS).
  • Automated data transformation and loading using Airflow and Python.
  • Ensured data integrity and consistency during migration.
  • Collaborated with stakeholders for data requirements and accurate data mapping.
  • Conducted testing, debugging, and optimization of data pipelines.
  • Provided support and maintenance for deployed solutions, ensuring timely data availability.

Project: Data Migration

Industry: Education
Technologies: GCP (Dataproc, Dataflow, BigQuery, Composer, Kubernetes Engine), Big Data (Sqoop, Spark, SQL), Python, Shell Script

Migrated an educational client's data from MS-SQL to BigQuery on GCP to analyze grades, classroom activities, and counseling call recordings for student progress assessment and retention improvement.

  • Implemented and automated a data ingestion pipeline: MS-SQL → Sqoop → GCS → Spark → BigQuery.
  • Developed Sqoop jobs for weekly incremental ingestion of 150 MS-SQL tables into BigQuery.
  • Automated schema conversion and data mapping from MS-SQL to BigQuery using Python scripts and Spark.
  • Collaborated with ML team to extract meaningful insights from data.
  • Handled development, testing, code reviews, and delivery of various project iterations.
  • Scheduled jobs in Cloud Composer for Sqoop, BigQuery, and ML processes.

Project: Cloudera to Google Databricks Migration

Technologies: GCP, Python, Scala, Databricks, Spark, Airflow

Data Flow:

  • FTP/SQL Server/Oracle → GCS → Databricks → BigQuery

Roles and Responsibilities:

  • Migrated Spark Scala code from Cloudera to Google Databricks.
  • Modified Spark code for new source and target systems.
  • Developed Scala/Spark jobs for data transformation and aggregation.
  • Developed unit test cases for Spark transformations.
  • Designed data processing pipelines and scheduled jobs using Airflow.

Senior Data Engineer

Quantiphi Analytics
07.2017 - Current

Project: Data Migration

Industry: Insurance

Technologies: GCP, Python, Dataflow (Apache Beam), BigQuery, Data Fusion, Google Maps API, Cloud Composer, Jupyter Notebook

Overview:

  • Migrated Liberty Mutual Insurance’s system to a Big Data platform on GCP.
  • Implemented data pipelines for ingestion, enhancement, and quality assurance.

Key Pipelines:

  • Data Ingestion: Oracle → GCS → Dataflow/Data Fusion → BigQuery
  • Enhancement: BigQuery → Dataflow → Google Places API → BigQuery
  • Data Quality: BigQuery → Dataflow (TensorFlow validation) → BigQuery
  • SQL Conversion: GCS → Dataflow → BigQuery

Responsibilities:

  • Loaded data to BigQuery.
  • Developed Google Places pipeline.
  • Converted Oracle schema to BigQuery schema.
  • Used Data Catalog API for tagging.
  • Built ingestion pipelines with Data Fusion.
  • Transformed SQL queries to Dataflow jobs.
  • Ensured data quality through rigorous testing, validation, and monitoring of all data assets, minimizing inaccuracies and inconsistencies.
  • Reengineered existing ETL workflows to improve performance by identifying bottlenecks and optimizing code accordingly.
  • Championed the adoption of agile methodologies within the team, resulting in faster delivery times and increased collaboration among team members.
  • Participated in strategic planning sessions with stakeholders to assess business needs related to data engineering initiatives.


Project: Data Lake Ingestion Framework

Industry: Insurance Tech

Technologies: Apache NiFi 1.2, Python 2.7, Java 1.7, AWS Cloud (NiFi, HDP infrastructure), GitHub, Jenkins, Bamboo


  • Developed and enhanced a Big Data project for Liberty Mutual Insurance on the Hadoop platform, integrating cloud capabilities.
  • Managed data flow from various sources (RDBMS/File/AWS RDS) through NiFi/Groovy Script/Shell to S3 and further processing with Hive/Spark for Web UI Portals.
  • Led a 6-member Enhancement Team, designing and implementing a NiFi-based ingestion framework.
  • Key implementations: schema evolution with Avro, Type 2 data handling, logging (start/end time, data count, etc.), and retry logic.
  • Automated NiFi template deployment using a Python framework.
  • Conducted enhancement, testing, code review, and delivery for modules such as Subrogation and RF Home.
  • Analyzed data and existing code to create mapping documents for various modules.
  • Developed Linux scripts for job automation.
  • Managed creation, insertion, data analysis, and testing of Hive tables.
  • Successfully deployed modules to the production environment.

Education

Master in Computer Science

Pune University
Pune
04.2001 -

Bachelor in Computer Science

Pune University
Pune
06.2014

Skills

  • Google Cloud Platform (GCP)
  • Amazon Web Services (AWS)
  • Python
  • Scala, Java
  • Apache Spark
  • Airflow
  • GCP tech stack: BigQuery, Dataflow, Dataproc
  • Build Automation Tools: Maven, Gradle, SBT

Certification

Google Cloud Certified - Professional Cloud Architect

Personal Information

  • Passport Number: P8149077
  • Date of Birth: 07/29/93
  • Marital Status: Married

Current Address

Pimpri Chinchwad, Pune, 411062