
Sandeep Sai

Bengaluru

Summary

  • Senior Data Engineer with over 6 years of professional experience, skilled in Python, MySQL, Azure Data Factory, Azure Databricks, Azure Synapse, PySpark, RPA, and Power BI.
  • Builds scalable data pipelines and modern data warehouse solutions to support growing data volume and complexity; collaborates effectively with cross-functional teams to deliver business-driven analytics solutions.
  • Designs, develops, and implements data pipelines in Azure Data Factory, moving and transforming data from sources such as databases, files, APIs, and cloud services, and delivering it to destinations such as data lakes, data warehouses, and databases.
  • Creates and maintains database objects such as tables, views, indexes, triggers, and sequences, applying best practices for database design and optimization to improve performance and scalability.
  • Transforms structured and semi-structured data from various sources using Mapping Data Flow transformations in Azure Data Factory, including Join, Conditional Split, Lookup, Union, Sort, Aggregate, Derived Column, Pivot, Parse, Rank, and Window, to standardize and enrich data for downstream analytics and reporting.
  • Implemented Slowly Changing Dimensions (SCD Types 1, 2, 3, and 4) in Azure Data Factory to capture and load delta changes into the data warehouse, designing strategies for historical and incremental updates that preserve data consistency and integrity across the data lifecycle.
  • Created ADF pipelines using linked services, datasets, and activities to extract, transform, and load data from sources such as Azure SQL, Blob Storage, and Azure SQL Data Warehouse.
  • Implemented email notifications using Azure Logic Apps to automate pipeline alerts, ensuring timely communication and efficient monitoring of data workflows.
  • Managed Azure Key Vault services, ensuring secure storage and management of sensitive data in accordance with industry best practices.
  • Led the implementation of CI/CD pipelines for Azure Data Factory (ADF), streamlining deployment and enabling continuous integration and delivery of data pipelines.
  • Implemented incremental pipelines with watermarks, improving data processing efficiency and minimizing resource utilization.
  • Managed the publication, scheduling, and triggering of pipelines on daily and weekly cadences, orchestrating data workflows for timely execution and delivery in line with client business requirements.
  • Scheduled Databricks notebooks from Azure Databricks and Azure Data Factory, optimizing data processing workflows for efficiency and reliability.
  • Used Azure Data Factory for ETL processes, including delta loads and insert-update loads, and automated them to improve productivity and accuracy.
  • Used Azure DevOps to manage Azure Data Factory and Azure Databricks across multiple environments, applying best practices for version control, deployment, and monitoring.
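The SCD handling described above was implemented in ADF Mapping Data Flows; purely as an illustrative sketch of the SCD Type 2 logic (the `customer_id`/`city` schema and the bookkeeping columns here are hypothetical, not from any actual project), the same upsert can be expressed in plain Python:

```python
from datetime import date

def apply_scd2(dimension, incoming, today=date(2024, 1, 1)):
    """Close out changed rows and append new versions (SCD Type 2 sketch).

    dimension: list of dicts with valid_from / valid_to / is_current
    bookkeeping columns; incoming: list of source rows (hypothetical schema).
    """
    # Index the current version of each business key.
    current = {r["customer_id"]: r for r in dimension if r["is_current"]}
    for row in incoming:
        existing = current.get(row["customer_id"])
        if existing is None:
            # Brand-new key: insert as the current version.
            dimension.append({**row, "valid_from": today,
                              "valid_to": None, "is_current": True})
        elif existing["city"] != row["city"]:
            # Tracked attribute changed: expire the old row, add a new one.
            existing["valid_to"] = today
            existing["is_current"] = False
            dimension.append({**row, "valid_from": today,
                              "valid_to": None, "is_current": True})
        # Unchanged rows are left untouched.
    return dimension
```

In ADF this same compare/expire/insert pattern is built from Lookup, Derived Column, and Alter Row transformations rather than hand-written code.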

Overview

6 years of professional experience
1 Certification

Work History

Azure Data Engineer

HCL Technologies
10.2020 - Current

Project Title: Data Integration and Analysis for Apple Green Company's Fast Food Chains.

Project Overview: Apple Green Company operates several fast-food chains including Burger King, KFC, and Pizza Hut across the US and UK. The project aims to collect, integrate, and analyze data from these locations to provide valuable insights to the client for better decision-making and operational efficiency.

Roles & Responsibilities:

  • Design and develop data pipelines to collect and integrate data from Burger King, KFC, and Pizza Hut locations into Azure.
  • Implement data cleaning and transformation processes to ensure data quality and consistency.
  • Work with Azure services to store and manage the collected data securely.
  • Optimize data processing and storage for performance and scalability.
  • Collaborate with stakeholders to understand analytical requirements and translate them into technical solutions.
  • Develop and deploy analytical models and algorithms to analyze the integrated data and extract insights.
  • Create interactive dashboards and reports using Power BI to visualize and communicate findings to stakeholders.
  • Monitor and maintain the data pipelines and analytical solutions to ensure they meet performance and reliability requirements.
  • Provide support and training to end-users on using the analytical solutions effectively.

  • Created complex stored procedures and performance-tuned SQL queries.
  • Developed Azure Data Factory pipelines to move data from staging to the data warehouse using an incremental load process.
  • Used Azure Databricks to perform transformations on the data; ADF pipelines invoke the Databricks jobs.
  • Used PySpark for Databricks jobs, including consuming Parquet files generated by AKS jobs.
  • Designed ADF pipelines using a lift-and-shift approach; designed and developed ADF and SSIS packages to load data from sources into the Azure database.
  • Designed ADF pipelines to move data from six different sources to Azure Data Lake Gen2 and then to the Azure data warehouse.
  • Implemented Copy, Execute Pipeline, Get Metadata, If Condition, Lookup, Set Variable, Filter, and ForEach activities for on-cloud ETL processing.
  • Primarily involved in data migration using SQL, Azure SQL, Azure Data Lake, and Azure Data Factory.
  • Experienced in data warehouse creation, extraction and loading design, testing, and data modeling, ensuring the smooth running of applications.
  • Extracted data from OLTP and OLAP systems to the Data Lake using Azure Data Factory and Databricks.
  • Developed pipelines that extract data from various sources and merge it into single datasets in the Data Lake using Databricks.
  • Created linked services for source and target connectivity based on requirements; pipelines and datasets are then triggered based on the load type (HISTORY or DELTA).
  • Created mount points for the Data Lake and extracted data in different formats such as CSV and Parquet.
  • Created DataFrames and transformed them using PySpark.
  • Loaded data from on-premises sources to the Data Lake and Azure SQL tables using SSIS and Azure Data Factory.
  • Extracted data from CSV, Excel, and SQL Server sources into staging tables dynamically using ADF pipelines.
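The HISTORY/DELTA loads above were orchestrated in ADF using a watermark; a minimal Python sketch of the watermark pattern, assuming a hypothetical `modified_at` column on the source rows:

```python
def incremental_load(source_rows, watermark):
    """Pull only rows changed after the last watermark, then advance it.

    source_rows: list of dicts with a 'modified_at' value (hypothetical
    schema); watermark: the high-water mark recorded by the previous run.
    """
    # DELTA = everything newer than the last successful run's watermark.
    delta = [r for r in source_rows if r["modified_at"] > watermark]
    # Advance the watermark to the newest row we just loaded;
    # keep the old one if nothing changed.
    new_watermark = max((r["modified_at"] for r in delta), default=watermark)
    return delta, new_watermark
```

Running with an initial watermark of zero (or the epoch) yields the full HISTORY load; subsequent runs pick up only the delta, which is what keeps resource utilization low.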

Data Engineer

Accenture
09.2018 - 08.2020

Software Engineer

EXL Services
01.2018 - 09.2018
  • Leveraged text, charts and graphs to communicate findings in understandable format.
  • Analyzed large amounts of data to identify trends and find patterns, signals and hidden stories within data.
  • Assessed large datasets, drew valid inferences and prepared insights in narrative or visual forms.
  • Identified, reviewed and evaluated data management metrics to recommend ways to strengthen data across enterprise.
  • Led recruitment and development of strategic alliances to maximize utilization of existing talent and capabilities.
  • Aggregated and cleaned data from TransUnion on thousands of customers' credit attributes.
  • Performed missing-value imputation using the population median; checked population distributions for numerical and categorical variables to screen outliers and ensure data quality.
  • Used a binning algorithm to calculate the information value of each attribute, evaluating its separation strength for the target variable.
  • Checked variable multicollinearity by calculating VIF across predictors.
  • Built a logistic regression model to predict the probability of default; used stepwise selection to choose model variables.
  • Tested multiple models by switching variables and selected the best model using performance metrics including KS, ROC, and Somers' D.
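The binning/information-value step above can be sketched in plain Python; the good/bad bin counts here are hypothetical, and this is only an illustration of the WOE/IV arithmetic, not the production implementation:

```python
import math

def information_value(bins):
    """Compute information value (IV) from pre-binned good/bad counts.

    bins: list of (goods, bads) tuples per bin (hypothetical counts).
    For each bin: WOE = ln(good% / bad%); IV = sum((good% - bad%) * WOE).
    """
    total_good = sum(g for g, _ in bins)
    total_bad = sum(b for _, b in bins)
    iv = 0.0
    for goods, bads in bins:
        good_pct = goods / total_good
        bad_pct = bads / total_bad
        woe = math.log(good_pct / bad_pct)  # weight of evidence for this bin
        iv += (good_pct - bad_pct) * woe
    return iv
```

An attribute whose bins separate goods from bads strongly gets a high IV (a common rule of thumb treats IV above roughly 0.3 as strong), while an attribute with identical good/bad distributions gets an IV of zero.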

Education

Master of Engineering

Osmania University
Hyderabad, India

B.Tech

JNTU
Hyderabad, India

Skills

  • Programming languages: SQL, Python
  • Databases and Azure cloud tools: Microsoft SQL Server, Azure Data Lake, Azure Blob Storage Gen2, Azure Synapse, Azure Event Hubs, Azure Data Factory, Azure Databricks, Azure Monitor, Azure Functions, Azure Logic Apps
  • Frameworks: Spark (Structured Streaming, SQL), Kafka Streams
  • Databases: Azure SQL Database, Azure SQL Data Warehouse, Microsoft SQL Server
  • Visualization: Power BI
  • Cloud platforms: Azure
  • 4 years of data engineering experience in the cloud
  • Strong in Azure services, including Azure Databricks (ADB) and Azure Data Factory (ADF)
  • Experienced in building real-time streaming analytics data pipelines; confident in connecting Event Hubs to Stream Analytics
  • Team-oriented; experienced with GitHub for version control
  • Skilled in data mining using Databricks notebooks, NumPy, and Pandas; data visualization using Power BI
  • Writing complex regular expressions to find patterns in data
  • Experienced in Agile methodology
  • Eager to take on challenges and curious to learn new things
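As a small illustration of the pattern-matching skill above (the log line format here is hypothetical, not from an actual project):

```python
import re

# Hypothetical example: pull the pipeline name, status, and duration
# out of ADF-style log lines such as "Pipeline [CopySales] succeeded in 42s".
PATTERN = re.compile(
    r"Pipeline \[(?P<name>\w+)\] (?P<status>\w+) in (?P<secs>\d+)s"
)

def parse_run(line):
    """Return the named groups as a dict, or None if the line doesn't match."""
    m = PATTERN.search(line)
    return m.groupdict() if m else None
```

Named groups (`(?P<name>...)`) keep the extraction readable when patterns grow more complex.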

Certification

DP-900 Azure Data Fundamentals

DP-203 Azure Data Engineer Associate

Timeline

Azure Data Engineer

HCL Technologies
10.2020 - Current

Data Engineer

Accenture
09.2018 - 08.2020

Software Engineer

EXL Services
01.2018 - 09.2018

Master of Engineering

Osmania University

B.Tech

JNTU