Arko Chakraborty

Azure Data Engineer, Data Analyst
Noida

Summary

Passionate data enthusiast with expertise in Python, PySpark, and SQL for data analysis and manipulation. Strong foundation in machine learning and statistics. Proficient in Azure Data Factory and Azure Databricks for working with cloud data environments in Azure. Seeking opportunities to contribute to data-driven decision-making and drive efficiency and success in global businesses.

Overview

9 years of professional experience
6 years of post-secondary education
4 Certifications

Work History

Data Engineer (Band 7B)

IBM India
05.2022 - Current

1. Responsible for implementing data pipelines in a dynamic fashion.

2. Understand business requirements, then develop data models and implement the corresponding tables or views.

3. Helped design the data landscape architecture for multiple projects.

4. Worked on the Azure Synapse data warehouse. Optimized queries written for the on-premises system to run on the distributed data warehouse, speeding them up by 93 percent on average.

5. Implemented a solution that traverses ADLS using the Python SDK and automatically creates the entire metadata table, which was the foundation of the data pipeline. This completed the load of over 100 tables in just two days.

6. Enabled the client before project completion by giving a demo and tutoring them on how to write optimized queries for their future use cases on Synapse.

7. Rated exceptional in two consecutive performance reviews.

8. Helped IBM win a client by delivering a solo POC on a local machine for a data science and data analysis project using Python. Demonstrated it to the client in person with great success.

9. Certified as an Azure Data Engineer Associate and as a Databricks Associate Data Engineer.

Senior Data Engineer

Coforge Ltd.
08.2020 - 05.2022
  • Worked with a banking domain client
  • Helped move their on-premises systems and logic to an Azure-based solution
  • Understood business requirements and planned solutions to deploy on the Azure data platform
  • Analyzed data using complex SQL queries
  • Designed data pipelines in ADF for data validation, transformation, and loading
  • Created Power BI reports to help the client gain insights
  • Implemented large-scale transformations using PySpark
  • Created alternative, efficient logic for processing their transaction data. It reduced hard-coding, introduced easily manageable reference tables for future modifications, and consolidated many unique scenarios into fewer pipelines and scripts. Using the parallel processing of Spark and ADF, it completed a complex transformation of millions of transactions per day in a few minutes, which otherwise took hours daily in Python
  • This task involved extensive data analysis, data correction, fault checks, scenario creation, flexible yet robust code, and extensive testing and daily deliberation with the team and the client
  • This was a team effort, but I played the pivotal role of leading the development and logical design and breaking the overall complexity into small chunks of easily understandable and executable logic
  • In one case, the complexity involved mapping over 300,000 rows of a single transaction against each other to find the matching debits and credits and check that they sum to zero, with logic extending across multiple tables of accounts, products, and channels. This was just one of many complex problems handled in PySpark
  • In the client's previous system, the logic mapped the same rows multiple times, producing duplicate and inflated amounts that they later corrected manually
  • They also used Oracle's MATCH_RECOGNIZE clause, which is unavailable in Python and PySpark
  • For this reason the alternative logic was created, and it proved more accurate and robust than their previous implementation
  • Explored solutions for further phases of development, which involved learning and training.
  • Collaborated with cross-functional teams to define requirements and develop end-to-end solutions for complex data engineering projects.
  • Delivered exceptional results under tight deadlines, consistently prioritizing tasks effectively to meet project timelines without compromising quality or accuracy.
  • Collaborated on ETL (Extract, Transform, Load) tasks, maintaining data integrity and verifying pipeline stability.
  • Ensured data quality through rigorous testing, validation, and monitoring of all data assets, minimizing inaccuracies and inconsistencies.
  • Reengineered existing ETL workflows to improve performance by identifying bottlenecks and optimizing code accordingly.

Instrumentation Manager

RIL Jamnagar
08.2015 - 07.2018
  • Responsible for smooth operations and maintenance of instruments in plant
  • Analyzed data gathered by instruments in critical process circuits and proactively applied fixes or notified the operations team to take necessary action
  • Ensured all safety instruments, automatic safety measures and logic were healthy and functional
  • Installed a CEMS analyser to ensure emissions stayed well within the permitted limit
  • Predicted maintenance requirements of instruments and amendments to trip logic
  • Planned shutdowns, inventory, and manpower requirements, and ensured smooth execution.

Education

MBA (Data Science) -

Neemrana University (NIIT)
01.2019 - 04.2021

B.Tech in Instrumentation and Control Engineering -

Manipal Institute of Technology
Manipal, Udupi, Karnataka
04.2011 - 04.2015

Skills

Strong fundamentals of statistics

Certification

AZ-900

Activities

I enjoy reading about emerging technologies and new algorithms in data science and machine learning. I also enjoy solving puzzles using Python. I aim to keep myself updated with the latest developments.

Timeline

Data Engineer (Band 7B)

IBM India
05.2022 - Current

MBA (Data Science) -

Neemrana University (NIIT)
01.2019 - 04.2021

Instrumentation Manager

RIL Jamnagar
08.2015 - 07.2018

B.Tech in Instrumentation and Control Engineering -

Manipal Institute of Technology
04.2011 - 04.2015

Senior Data Engineer

Coforge Ltd.
08.2020 - 05.2022