Pinku Purkait

Kolkata

Summary

Data engineer with 6 years of experience in data engineering and 10 years of overall experience across the Retail, Telecom, Health Care and Finance domains. Designs, develops and maintains big data and distributed data pipelines in highly available configurations. Expertise in big data technologies such as Databricks, PySpark, Hadoop, Hive, Spark SQL and Sqoop. Experienced in implementing ETL jobs with Sqoop to move data from RDBMS to HDFS and vice versa. Experienced in analyzing data using HiveQL. Good knowledge of using Spark APIs to cleanse, explore, aggregate, transform and store data. Experience with Hadoop clusters on major Hadoop distributions such as Cloudera 5.14. An effective team player with good interpersonal, analytical and client-service skills.

Overview

10 years of professional experience
3 Languages

Work History

Data Engineer

Larsen & Toubro Infotech Limited (LTIMindtree)
03.2022 - Current
  • Company Overview: Data warehousing and Unity Catalog migration for one of the largest pharmaceutical and biotech companies in the US (India, offshore)
  • Ensuring the data pipeline is up and running
  • Engaging directly and regularly with the various stakeholders who consume the data
  • Ensuring data quality as per users' requirements
  • Debugging and fixing issues encountered in daily tasks
  • Analyzing root causes and taking the necessary action to fix issues
  • Job/workflow scheduling and monitoring through the Airflow web UI
  • Handling data from different source systems into a single Hadoop cluster used as an integrated data warehouse, and applying further transformations to meet business users' requirements
  • Analyzing issues, finding workarounds and capturing them in Confluence pages/knowledge base
  • Technology Used: Databricks, Unity Catalog, PySpark, Spark SQL, SQL, Airflow, Bitbucket, GitHub, Confluence etc.

Data Engineer, Data Analyst

Tata Consultancy Services Ltd
04.2015 - 03.2022
  • Company Overview: Data migration for the largest telecom service provider in the US (India, offshore)
  • Ensuring data availability as per the design and requirements
  • Communicating with the various stakeholders who consume the processed data
  • Ensuring data quality as per end users' requirements
  • Implementing end-to-end Hadoop infrastructure using HDFS, Hive, Oozie and Spark
  • Debugging and fixing issues encountered in daily tasks
  • Analyzing root causes and taking the necessary action to fix issues
  • Job/workflow scheduling and monitoring through Oozie
  • Worked across the complete software development life cycle (analysis, development, testing, implementation and support) using agile methodologies
  • Migrating data from different Hadoop clusters to a single Hadoop cluster used as an integrated data warehouse for analysis and insights by various consumers
  • Analyzed problems and implemented solutions, ensuring solutions were captured in the knowledge base
  • Technology Used: HDFS, HIVE, OOZIE, SQL, SPARK

Data Engineer

Tata Consultancy Services Ltd
04.2015 - 03.2022
  • Company Overview: Data migration for the largest telecom service provider in the UK (India, offshore)
  • Implementing end-to-end Hadoop infrastructure using Hive, Sqoop, Oozie and Spark
  • Using Sqoop as the data ingestion tool to import data from relational databases to HDFS
  • Job/workflow scheduling and monitoring through Oozie
  • Designing both time-driven and data-driven automated workflows using Oozie
  • Worked across the complete software development life cycle (analysis, design, development, testing, implementation and support) using agile methodologies
  • Experience with Hadoop clusters on Cloudera (CDH5)
  • Migrating data from different databases (e.g. Oracle, MySQL) to Hadoop
  • Developing Hadoop applications and recommending the right solutions and technologies for them
  • Analyzed problems and implemented solutions, ensuring solutions were captured in the knowledge base
  • Technology Used: HDFS, HIVE, SQOOP, OOZIE, SQL, SPARK

Data Analyst

Tata Consultancy Services Ltd
04.2015 - 03.2022
  • Company Overview: Analytics and data services for consumer, financial and property data in the US (India, offshore)
  • The core of the job was ETL (Extract, Transform, Load): extracting data from the database, transforming it, and loading it back as consolidated, integrated data as per business requirements
  • Developing mapping functions (one-to-one and one-to-many) for field-level transformations and applying complex logic to extract useful information from random datasets
  • Conducting data mining, data modelling and data cleaning on real property information from the US real estate market
  • Identifying missing data sets that are valid and important for specific tables in a database
  • Generating reports in various forms, such as simple statistical reports, field-level data difference reports and abnormal data population reports
  • Interacting with the various sources from which data is collected or prepared
  • Collaborated with cross-functional teams to analyze, investigate and diagnose the root causes of problems and to complete corrective actions
  • Engaged at a basic technical level in discussions to evaluate those solutions and published Root Cause Analysis (RCA) reports
  • Technology Used: SQL, Access, Excel, Customized ETL Tool

Education

Bachelor of Commerce (B.COM) - Commerce

University of Calcutta

Skills

Databricks

Mobile Numbers

  • 9748757087
  • 9831858327

Personal Information

  • Passport Number: N2816471
  • Passport Date Of Issue: 03/09/15
  • Passport Expiry Date: 02/09/25
  • Father's Name: Bimal Purkait
  • Date of Birth: 04/14/87
  • Marital Status: Single

Timeline

Data Engineer - Larsen & Toubro Infotech Limited (LTIMindtree)
03.2022 - Current
Data Engineer, Data Analyst - Tata Consultancy Services Ltd
04.2015 - 03.2022
Data Engineer - Tata Consultancy Services Ltd
04.2015 - 03.2022
Data Analyst - Tata Consultancy Services Ltd
04.2015 - 03.2022
University of Calcutta - Bachelor of Commerce (B.COM), Commerce