
AMIT TIKHE

Pune

Summary

Data Engineer with 3+ years of experience, adept at designing and optimizing ETL processes. Proficient in SQL, PySpark, AWS services, and the Hadoop ecosystem. Skilled in data warehousing and managing data pipelines. Demonstrated ability in performance tuning and distributed data processing using Spark. Collaborative team player, currently enhancing data solutions at Tata Consultancy Services.

Overview

3 years of professional experience
1 Certification

Work History

System Engineer

Tata Consultancy Services
04.2021 - Current
  • Led a team in delivering large-scale data solutions efficiently.
  • Addressed and resolved complex data processing challenges, focusing on performance optimization and scalability.
  • Provided technical leadership and mentorship to junior team members, fostering skill development and professional growth.
  • Engaged in impact analysis and estimated budgets for ETL components across multiple upcoming projects.


Project 1: Cornerstone Data Migration

Developed a Python/PySpark solution for API data extraction to AWS S3, with transformation and Snowflake integration.


Roles and Responsibilities:

  • Used PySpark to orchestrate data pipelines, enabling efficient data exchange between API endpoints and distributed storage solutions, including AWS S3 and Snowflake.
  • Executed large-scale data analysis and transformation tasks using Spark SQL and Spark's data processing capabilities.
  • Managed incremental data operations including inserts, updates, and deletions sourced from APIs.
  • Implemented parallel processing using PySpark to efficiently handle large-scale data ingestion from APIs.
  • Automated daily data updates and workflow scheduling using Airflow.


Project 2: Canadian Profitability

Streamlined processing of an offline MS Access database by developing automation scripts in PySpark, saving 1,400 labour hours annually. The generated data was used by QlikView applications to enhance data visualization.


Roles and Responsibilities:

  • Analyzed customer business requirements to prepare data mappings, evaluating data flow, transformation needs, and data fixes.
  • Authored Spark and Hive jobs to efficiently extract records from multiple downstream sources.
  • Managed export and import of batch and delta data into HDFS, HBase, and Hive utilizing PySpark.
  • Monitored job executions, performed debugging, and resolved bugs to ensure smooth operations.
  • Implemented job automation solutions using shell scripting and Airflow for enhanced efficiency.


Project 3: HRDW Re-platform

Successfully executed a re-platforming project involving ETL processes using IBM DataStage. Transformed and stored data in a data warehouse, making it accessible to reporting applications such as Cognos, QlikView, QlikSense, and Power BI.


Roles and Responsibilities:

  • Orchestrated analysis of upstream and downstream data flows and business requirements.
  • Utilized IBM DataStage to implement robust ETL processes, facilitating efficient data import/export operations within the data warehouse environment.
  • Performed data loads and conducted extensive SQL-based data validation and quality assurance to uphold data integrity.
  • Crafted and automated data extraction procedures using Linux shell scripting, optimizing system performance through strategic cron job scheduling.

Education

Master of Science - Mathematics And Computer Science

Govt Vidarbha Institute of Science & Humanity
Amravati, India

Bachelor of Science - Computer Science

Govt Vidarbha Institute of Science & Humanity
Amravati, India

Skills

  • Big Data: Hadoop (HDFS, YARN), MapReduce, Hive, Sqoop, Spark
  • Programming Languages: Python, HiveQL, and Shell Scripting
  • Database: Oracle, DB2, PostgreSQL, Snowflake, MySQL
  • Cloud Service: AWS (EC2, S3, EMR, RDS, DynamoDB, Redshift, and Athena)
  • Version Control: GitHub
  • Reporting: Power BI, Cognos

Certification

AWS Cloud Practitioner Certification - AWS

Credential Id: AbqZUesXPLsFAt5NkujoFXXJjSXJ

Achievements

On The Spot Team Award

Best Team Award
