A good team player with a strong ability to lead, adopt new skills, and learn things on the fly
Data Scientist and Azure Cloud Data Engineer with over 9 years of IT experience; skilled in gathering, cleaning, and organizing data; predictive modeling, classification, clustering, data mining, Natural Language Processing, recommendation model implementation, and Artificial Intelligence; Big Data ingestion frameworks and pipeline creation; and CI/CD deployments with Ansible. Advanced understanding of statistical and analytical techniques. Highly organized, motivated, and diligent, with a significant background in Data Science and Azure Cloud.
Data Science/Machine Learning:
Implemented an NLP intent classification model to classify Service Requests and Incidents from email content, and an entity recognition model to extract the required entities from the email request
Implemented a content-based recommender system to recommend related deals to the user using text analytics (see the first sketch after this list)
Implemented a classification model to predict the probability of order cancellation and sales-order return at the order/product level
Worked on data cleansing, preprocessing, and preparation from relational and non-relational databases
Performed Exploratory Data Analysis (EDA) to extract key insights from the data
Created machine learning models using the Azure Machine Learning framework and integrated them via web services
Grouped the deals industry-wise into clusters using the K-means clustering algorithm
Performed text analytics/text mining with tokenization, stop-word removal, corpus creation, tf-idf, and count vectorization for a bag-of-words approach using NLTK libraries
Used cosine similarity between tf-idf matrices to identify related content within the deals
Used K-fold cross-validation for model selection and grid search for optimization (see the second sketch after this list)
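A minimal sketch of the recommendation and clustering steps above, assuming a hypothetical in-memory `deals` corpus; in the actual work the text came from the deals data source, and the names and parameters here are illustrative:

```python
# Minimal sketch: tf-idf deal recommendations plus K-means clustering.
import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.cluster import KMeans

nltk.download("punkt", quiet=True)
nltk.download("stopwords", quiet=True)
stop_words = set(stopwords.words("english"))

def preprocess(text):
    # Tokenize and drop stop words / non-alphabetic tokens (bag-of-words prep).
    tokens = word_tokenize(text.lower())
    return " ".join(t for t in tokens if t.isalpha() and t not in stop_words)

deals = [  # hypothetical stand-in corpus
    "Cloud migration deal for retail client",
    "Retail analytics and cloud data platform",
    "Manufacturing IoT predictive maintenance",
]

corpus = [preprocess(d) for d in deals]
matrix = TfidfVectorizer().fit_transform(corpus)

# Cosine similarity between tf-idf vectors identifies related deals.
similarity = cosine_similarity(matrix)
most_similar_to_first = similarity[0].argsort()[::-1][1]  # index 0 is the self-match
print("Deal most similar to deal 0:", deals[most_similar_to_first])

# K-means groups the deals into clusters on the same tf-idf features.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=42).fit(matrix)
print("Cluster labels:", kmeans.labels_)
```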
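And a minimal sketch of the model-selection step, with a stand-in dataset and an illustrative estimator and parameter grid:

```python
# Minimal sketch: K-fold cross-validation with grid search for model selection.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, KFold

# Stand-in data; in practice X/y came from the prepared order dataset.
X, y = make_classification(n_samples=500, n_features=10, random_state=42)

param_grid = {"n_estimators": [100, 200], "max_depth": [5, 10]}
cv = KFold(n_splits=5, shuffle=True, random_state=42)

search = GridSearchCV(RandomForestClassifier(random_state=42),
                      param_grid, cv=cv, scoring="roc_auc")
search.fit(X, y)
print("Best params:", search.best_params_, "CV AUC:", search.best_score_)
```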
Azure Databricks Development:
Actively participated in requirement analysis and in designing the mapping toolkit to build dimension and fact tables for data modeling.
Implemented PySpark Databricks notebooks to ingest and transform data from the source (raw) container to the target (output) container by applying business logic.
Created external Hive tables with partitions on the target Parquet and Delta file-format folders.
Implemented code-reuse techniques by creating separate notebooks for variable initialization and PySpark functions, calling them from each dimension and fact table notebook execution.
Handled each dimension and fact table's data ingestion with two notebooks: one reads and transforms the input data, and the other performs the data merge using Delta logic (SCD Type 1); see the sketch after this list.
Implemented a logging mechanism in Databricks notebooks to log each performed activity using try/except blocks while the notebook runs.
Extensively worked with different file formats, including CSV, Parquet, and Delta.
Implemented performance-optimization techniques in DataFrame read/write operations by creating partitions in external Hive tables.
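A minimal sketch of the merge-notebook pattern above, condensed to one snippet and assuming a Databricks runtime (where `spark` is predefined) with Delta Lake available; the paths, key column, and logger name are hypothetical:

```python
# Minimal sketch: Delta merge (SCD Type 1) with try/except logging,
# as run inside a Databricks notebook (the `spark` session already exists).
import logging

from delta.tables import DeltaTable

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("dim_customer_load")   # hypothetical logger name

source_path = "/mnt/raw/customer"              # hypothetical source (raw) path
target_path = "/mnt/output/dim_customer"       # hypothetical target path

try:
    log.info("Reading and transforming source data")
    src = spark.read.format("parquet").load(source_path)

    log.info("Merging into target Delta table (SCD Type 1 upsert)")
    target = DeltaTable.forPath(spark, target_path)
    (target.alias("t")
           .merge(src.alias("s"), "t.customer_id = s.customer_id")
           .whenMatchedUpdateAll()      # overwrite changed attributes in place
           .whenNotMatchedInsertAll()   # insert brand-new keys
           .execute())
    log.info("Merge completed")
except Exception as exc:
    log.error("Load failed: %s", exc)
    raise
```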
Azure Data Factory Development:
Implemented ADF pipelines for data-driven workflows in the cloud, orchestrating and automating data movement and data transformation.
Implemented event-based and schedule-based triggers to extract files from Azure Blob Storage to ADLS Gen2 storage.
Created pipelines that extract data with the Copy Data activity, transform it with a Databricks Notebook activity, and finally load it into SQL tables with the Copy Data activity.
Implemented pipelines according to business rules using different ADF activities such as If Condition, ForEach, Get Metadata, Set Variable, Filter, and Switch.
Invoked child pipeline runs from a master pipeline using the Execute Pipeline activity, passing in the master pipeline's parameters.
Used Until and Wait activities to keep another pipeline run queued while the current pipeline run is in progress.
Created an Azure Logic App and posted requests to it via the ADF Web activity to send automated emails acknowledging pipeline successes/failures to business/project resources (see the sketch after this list).
Created pipelines to load data from ADLS Gen2 folders in Parquet/Delta file formats into SQL database tables.
Extensively used Azure DevOps to create repositories holding all code related to Databricks, Data Factory, Logic Apps, and SQL tables/views.
Implemented different CI/CD pipelines to publish individual code changes to their Databricks/Data Factory release folders.
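A minimal sketch of the notification call behind the Web activity above: the Logic App exposes an HTTP trigger URL and the pipeline posts a JSON status body to it. The URL and payload fields are hypothetical, shown here with Python's requests for readability:

```python
# Minimal sketch: posting a pipeline status notification to a Logic App
# HTTP trigger (the same JSON body an ADF Web activity would send).
import requests

# Hypothetical Logic App HTTP trigger URL (copied from the Logic App designer).
LOGIC_APP_URL = "https://prod-00.eastus.logic.azure.com/workflows/<id>/triggers/manual/paths/invoke"

payload = {
    "pipelineName": "pl_load_dim_customer",    # illustrative values; in ADF these
    "runId": "0000-1111",                      # come from system variables such as
    "status": "Failed",                        # @pipeline().Pipeline and
    "message": "Copy Data activity timed out"  # @pipeline().RunId
}

resp = requests.post(LOGIC_APP_URL, json=payload, timeout=30)
resp.raise_for_status()  # 2xx once the Logic App accepts the email run
```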
Business Intelligence Developer:
Worked on the accelerated work items and migrated servers from older versions to SQL Server 2016
Created the target builds for VSO check-in and applied builds; automated database backup, copy, and restore; created test scenarios; and performed single-box development environment setup at the offshore level
Worked on job issue fixes (MS batch) and ad-hoc requests, handled space issues caused by database file growth, and created linked servers
Installed and configured SQL Server and patched the servers with service packs
Monitored processes using Activity Monitor and troubleshot blocking processes
Developed, implemented, supported and maintained SSIS package workflows.
Implemented BI solution framework for end-to-end business intelligence projects.
Worked with various Control Flow items (For Loop, Foreach Loop, Sequence, Execute SQL Task, File System Task) and Data Flow items (Flat File and OLE DB sources and destinations).
Designed and developed SSIS packages using different transformations such as Conditional Split, Multicast, Union All, Merge, Merge Join, and Derived Column.
Implemented different types of data loading (direct, incremental, SCD Type 2, etc.) using packages and configuration files (see the sketch after this list); prepared unit test specification requirements.
Designed SSIS packages to move fine-tuned data from various heterogeneous data sources, flat files, and Excel sheets to the data marts and staging areas.
Worked on various SSIS tasks such as the Transform Data Task, Execute SQL Task, Bulk Insert Task, Foreach Loop, Sequence Container, and File System Task.
Used various report items such as tables, sub-reports, and charts in SSRS and uploaded reports into Report Manager.
Deployed SSIS packages to production servers and the file system as per requirements, and monitored day-to-day data loads of the various SSIS packages.
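A minimal sketch of the incremental-load pattern named above; the SSIS packages implemented it visually, but the core is a T-SQL MERGE, shown here issued through pyodbc with hypothetical connection, table, and column names:

```python
# Minimal sketch: incremental (upsert) load from staging into a target table,
# the pattern the SSIS packages implemented. All names are hypothetical.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=myserver;"
    "DATABASE=DataMart;Trusted_Connection=yes;"
)

merge_sql = """
MERGE dbo.DimProduct AS tgt
USING stg.Product AS src
    ON tgt.ProductKey = src.ProductKey
WHEN MATCHED AND tgt.ProductName <> src.ProductName THEN
    UPDATE SET tgt.ProductName = src.ProductName
WHEN NOT MATCHED BY TARGET THEN
    INSERT (ProductKey, ProductName) VALUES (src.ProductKey, src.ProductName);
"""

with conn:
    conn.execute(merge_sql)  # pyodbc commits on clean exit from the context
```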
Data Science, Python, Machine Learning, Databricks, PySpark, Spark SQL, Azure Data Factory, Data Lake, R, Tableau, Power BI & SQL Server
Participated in an AI Hackathon conducted by Accenture
Actively involved in CSR activities and initiatives organized by Accenture
Regularly conduct teaching sessions on different modules of Data Science and Machine Learning, helping team members understand machine learning concepts
AZ-900 Microsoft Azure Fundamentals
774 - Perform Cloud Data Science with Azure Machine Learning