As a recent graduate with a passion for data engineering, I have hands-on experience with tools such as Azure SQL, Databricks, and Azure Data Factory. I also have a strong foundation in Python and SQL, enabling me to contribute effectively to building scalable data solutions. I am excited to start my career and apply my skills to complex data challenges.
Extracted data from PDFs using Azure AI Form Recognizer, performed exploratory data analysis in the retail domain, and conducted theoretical research on zero-shot text classification.
Technologies used: Python, Pandas, Matplotlib, Seaborn, Azure Blob Storage, Azure AI Form Recognizer.
Python for Data Analytics
SQL
Power BI
MongoDB
Azure Services
Jupyter Notebook
Databricks
Machine Learning
Java Programming
Web Technologies
Python
C Programming
PHP
Data Mining
Data Structures and Algorithms
Figma, Canva
· Coursera Certification - Data Analysis with Python from IBM
· Microsoft Certified: Azure Fundamentals (AZ-900)
Historical Data Ingestion and Transformation using Azure Data Factory (ADF)
This project focuses on ingesting and processing both historical and real-time data for an e-commerce platform. Historical data is ingested from Azure Blob Storage. ADF pipelines orchestrate the movement and transformation of this data, automating tasks like copying Parquet files to Azure SQL and Cosmos DB.
Technology Used – Azure Data Factory, Azure SQL, Cosmos DB, Azure Event Hubs, and Databricks for data transformation and storage.
Data Analytics and Prediction for Movies
Analytics to predict the type and language of movie that a person of a given age and gender would most likely prefer.
Technology Used – Python and its libraries, Support Vector Machine algorithm.
News Classification with Machine Learning
Implementing a machine learning model for websites that reads news headlines or article content and classifies the news by category.
Technology Used – Python and its libraries (Pandas and NumPy), Multinomial Naïve Bayes algorithm for classification.
Static Website Hosting Using AWS
Hosting a static website on AWS E2 storage so that the fully functioning site is easily and efficiently accessible.
Technology Used – MySQL, PHP, HTML, CSS, AWS E2 services.
TRANSLATE+
Translate+ is an all-in-one tool for language translation. It incorporates speech recognition, image-text recognition, translation, and conversion into various formats.
Technology Used – Python and several of its libraries: OpenCV, SpeechRecognition, langdetect, googletrans, gTTS, PyQt5.
Heart Disease Prediction Using Logistic Regression
Implementing a machine learning model on the Cleveland Heart Disease dataset to predict the presence of heart disease based on various attributes.
Technology Used – Python and its libraries (Pandas and scikit-learn), Logistic Regression for classification.
Security Essentials: Spam Detection Using Machine Learning, and Demonstration and Prevention of SQL Injection and CSRF Attacks
A website for testing and demonstrating the implementation of various security strategies.
Technology Used – HTML, CSS, Node.js, jQuery, Python and its libraries, Support Vector Machine and Random Forest algorithms for spam detection in emails.