Arko Chakraborty

Azure Data Engineer, Data Analyst

Noida

Summary

Passionate data enthusiast with expertise in Python, PySpark, and SQL for data analysis and manipulation. Strong foundation in machine learning and statistics. Proficient in Azure Data Factory and Azure Databricks for building cloud data environments in Azure. Seeking opportunities to contribute to data-driven decision-making and to drive efficiency and success in global businesses.

Overview

years of professional experience

years of post-secondary education

Certifications

Work History

Data Engineer (Band 7B)

IBM India

05.2022 - Current

1. Responsible for implementing dynamic data pipelines.

2. Understand business requirements, develop data models accordingly, and implement the tables or views.

3. Helped design the data landscape architecture for multiple projects.

4. Worked on the Azure Synapse data warehouse. Optimized queries written for an on-prem system to run in the distributed data warehouse, speeding them up by 93 percent on average.

5. Implemented a solution that traverses ADLS using the Python SDK and automatically creates the entire metadata table, which was the foundation of the data pipeline, and completed the load of over 100 tables in just two days.

6. Enabled the client before project completion by giving a demo and tutoring them on writing optimized queries for their future use cases on Synapse.

7. Rated exceptional in two consecutive performance reviews.

8. Helped IBM win a client by completing a solo POC for a data science and data analysis project in Python on a local machine, and demonstrated it to the client in person with great success.

9. Certified as an Azure Data Engineer Associate and as a Databricks Data Engineer Associate.

Senior Data Engineer

Coforge Ltd.

08.2020 - 05.2022

Worked with a banking domain client.
Helped move their on-prem systems and logic to an Azure-based solution.
This involved understanding business requirements and planning a solution to deploy on the Azure data platform.
Analyzed data using complex SQL queries.
Designed data pipelines in ADF for data validation, data transformation, and data loading.
Created Power BI reports to help the client gain insights.
Implemented large-scale transformations using PySpark.
Created alternate, efficient logic for processing their transaction data. It reduced hard-coding, introduced easily manageable reference tables for future modifications, and consolidated many unique scenarios into fewer pipelines and scripts. Using the parallel processing of Spark and ADF, it completed complex transformations of millions of transactions per day in a few minutes, work that had previously taken hours daily in Python.
This task involved extensive data analysis, data correction, fault checks, scenario creation, flexible yet robust code, and extensive daily testing and discussion with the team and the client.
This was a team effort, but I played the pivotal role of leading the development and logical design, breaking the overall complexity into small chunks of easily understandable and executable logic.
(In one case, the complexity involved mapping over 300,000 rows of a single transaction against each other to find the matching debits and credits and check that they sum to zero, with logic extending across multiple tables of accounts, products, and channels.
This is just one of many complex problems that were handled in PySpark.)
In the client's previous system, the logic mapped the same rows multiple times, leading to duplicate amounts and inflated figures that they later corrected manually.
They also used Oracle's MATCH_RECOGNIZE clause, which is unavailable in Python and PySpark.
For this reason the alternate logic was created, and it became much more accurate and robust than their previous implementation.
Explored solutions for further phases of development, which involved learning and training.
Collaborated with cross-functional teams to define requirements and develop end-to-end solutions for complex data engineering projects.
Delivered exceptional results under tight deadlines, consistently prioritizing tasks effectively to meet project timelines without compromising quality or accuracy.
Collaborated on ETL (Extract, Transform, Load) tasks, maintaining data integrity and verifying pipeline stability.
Ensured data quality through rigorous testing, validation, and monitoring of all data assets, minimizing inaccuracies and inconsistencies.
Reengineered existing ETL workflows to improve performance by identifying bottlenecks and optimizing code accordingly.

Instrumentation Manager

RIL Jamnagar

08.2015 - 07.2018

Responsible for smooth operation and maintenance of instruments in the plant.
Analyzed data gathered by instruments in critical process circuits, proactively applying possible fixes or notifying the operations team to take necessary action.
Ensured all safety instruments, automatic safety measures, and logic were healthy and functional.
Installed a CEMS analyser to ensure emissions stayed well within permitted limits.
Predicted maintenance requirements of instruments and amendments to trip logic.
Planned shutdowns, inventory, and manpower requirements, and ensured smooth execution.

Education

MBA (Data Science)

Neemrana University (NIIT)

01.2019 - 04.2021

B.Tech in Instrumentation and Control Engineering

Manipal Institute of Technology

Manipal, Udupi, Karnataka

04.2011 - 04.2015

Skills

Strong Fundamentals of Statistics

Machine Learning

Python

Azure Databricks

SQL

PySpark

Azure Data Factory

Azure Synapse Analytics

Data Analysis

ETL development

Data Modeling

Data Warehousing

Performance Tuning

Big Data Processing

Certification

AZ-900

Activities

I enjoy reading about emerging technologies and new algorithms in data science and machine learning. I also enjoy solving puzzles using Python. I aim to keep myself updated with the latest developments.

Timeline

Data Engineer (Band 7B)

IBM India

05.2022 - Current

MBA (Data Science)

Neemrana University (NIIT)

01.2019 - 04.2021

Instrumentation Manager

RIL Jamnagar

08.2015 - 07.2018

B.Tech in Instrumentation and Control Engineering

Manipal Institute of Technology

04.2011 - 04.2015

Senior Data Engineer

Coforge Ltd.

08.2020 - 05.2022
