Product Review Analysis (Trimester 1 Project)
- Developed multiple applications, including sentiment analysis of customer reviews (sketched below).
- Implemented a suggestion system to improve products by distinguishing constructive criticism from non-constructive feedback.
- Created an inventory stocking system based on sales data and customer feedback, tailored to region and season.
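A minimal sketch of the review-sentiment / constructive-feedback classification step described above, assuming a scikit-learn TF-IDF + logistic regression pipeline (the project summary does not name the exact models); the reviews and labels are illustrative placeholders.

```python
# Minimal sentiment/constructiveness classification sketch for customer reviews.
# Assumes scikit-learn; the reviews and labels below are illustrative placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

reviews = [
    "The battery dies within two hours, please improve it.",   # actionable suggestion
    "Terrible product, waste of money.",                        # no actionable feedback
    "Love the screen, but the hinge feels loose.",              # actionable suggestion
    "Absolutely perfect, no complaints.",                        # no actionable feedback
]
labels = ["constructive", "non_constructive", "constructive", "non_constructive"]

# TF-IDF features feeding a linear classifier; swap in any labelled review corpus.
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression(max_iter=1000))
clf.fit(reviews, labels)

print(clf.predict(["The strap broke after a week, use stronger stitching."]))
```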
Fake News Campaign Detection (Trimester 2 Project) for Social Network Analysis
- Utilized Graph Neural Networks (GNNs) to detect the spread of fake news and assess the influence of specific networks in disseminating misinformation.
- Implemented a Bidirectional LSTM for keyword-based fake news detection (sketched below), contributing to a research paper.
- Graduated with Distinction.
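A minimal sketch of a Bidirectional LSTM headline classifier like the one used for keyword-based fake news detection above, assuming PyTorch; the vocabulary size, layer dimensions, and toy batch are illustrative placeholders.

```python
# Minimal BiLSTM text-classifier sketch (fake vs. real headline).
# Assumes PyTorch; dimensions and the dummy batch are illustrative.
import torch
import torch.nn as nn

class BiLSTMClassifier(nn.Module):
    def __init__(self, vocab_size=10_000, embed_dim=128, hidden_dim=64, num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * hidden_dim, num_classes)  # forward + backward final states

    def forward(self, token_ids):
        embedded = self.embedding(token_ids)              # (batch, seq, embed)
        _, (hidden, _) = self.lstm(embedded)              # hidden: (2, batch, hidden)
        features = torch.cat([hidden[0], hidden[1]], dim=1)
        return self.fc(features)                          # (batch, num_classes)

model = BiLSTMClassifier()
dummy_batch = torch.randint(1, 10_000, (4, 32))           # 4 headlines, 32 token ids each
print(model(dummy_batch).shape)                            # torch.Size([4, 2])
```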
3D Indoor Map Generator using Advanced Computer Vision
- Tech Stack: Node.js, OpenCV, WebSockets, Express.js, Three.js, QR Code Scanning, REST API
- Developed a 3D indoor navigation system leveraging advanced computer vision and Node.js to generate dynamic route maps for large complexes such as hospitals, malls, and universities.
- Implemented real-time indoor mapping using OpenCV and Three.js to visualize paths in 3D.
- Enabled users to scan a QR code at entry points to connect to the server and retrieve their current location.
- Designed a location-based routing algorithm to compute and display the optimal path to the selected destination.
- Built a backend with Express.js and WebSockets for live updates and smooth navigation assistance.
- Ensured scalability and cross-platform support for integration in multi-building infrastructures.
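A language-agnostic sketch of the location-based routing step described above; the project's backend runs on Node.js/Express, so this Python version only illustrates the shortest-path idea, and the waypoint graph with metre weights is an illustrative placeholder.

```python
# Sketch of routing between indoor waypoints (QR anchors) with Dijkstra's algorithm.
# The graph below is an illustrative placeholder, not the project's real map data.
import heapq

graph = {
    "entrance": {"lobby": 10},
    "lobby": {"entrance": 10, "elevator": 15, "ward_a": 30},
    "elevator": {"lobby": 15, "ward_a": 12},
    "ward_a": {"lobby": 30, "elevator": 12},
}

def shortest_path(graph, start, goal):
    """Dijkstra over the waypoint graph; returns (total distance, ordered node path)."""
    queue = [(0, start, [start])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, weight in graph[node].items():
            if neighbour not in visited:
                heapq.heappush(queue, (cost + weight, neighbour, path + [neighbour]))
    return float("inf"), []

print(shortest_path(graph, "entrance", "ward_a"))  # (37, ['entrance', 'lobby', 'elevator', 'ward_a'])
```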
Custom Annotation Tool & Automated Model Trainer for YOLO/ONNX Models for Advanced Computer Vision
- Tech Stack: Python, OpenCV, Flask, HTML/CSS, JavaScript, LabelImg, YOLOv5/YOLOv8, ONNX, Reinforcement Learning, NumPy, Pandas
- Developed a custom annotation and model training pipeline to streamline dataset preparation and training for advanced computer vision models.
- Built a lightweight annotation tool inspired by Roboflow and LabelImg with a user-friendly HTML interface to upload videos, extract frames (FPS-based), and define dataset split ratios (train/val/test).
- Integrated automated frame extraction using OpenCV and handled dataset organization dynamically through the backend.
- Implemented reinforcement learning-based annotation assistance to reduce manual effort and improve label accuracy.
- Enabled seamless YOLOv5/YOLOv8 and ONNX model training post-annotation, with automated hyperparameter tuning and output of optimized `best.pt` or `best.onnx` model files.
- Designed the entire workflow to support rapid prototyping and deployment of custom object detection models for various use cases.
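A minimal sketch of the FPS-based frame extraction and train/val/test split steps described above, assuming OpenCV (cv2); the video path, sampling interval, and split ratios are illustrative placeholders.

```python
# Minimal frame-extraction and dataset-split sketch for the annotation pipeline.
# Assumes OpenCV; paths and ratios below are placeholders.
import cv2, os, random

def extract_frames(video_path, out_dir, every_n_seconds=1.0):
    """Save one frame every `every_n_seconds`, based on the video's native FPS."""
    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 30
    step = max(int(fps * every_n_seconds), 1)
    saved, idx = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % step == 0:
            path = os.path.join(out_dir, f"frame_{idx:06d}.jpg")
            cv2.imwrite(path, frame)
            saved.append(path)
        idx += 1
    cap.release()
    return saved

def split_dataset(frames, ratios=(0.7, 0.2, 0.1)):
    """Shuffle and split frame paths into train/val/test lists."""
    random.shuffle(frames)
    n_train = int(len(frames) * ratios[0])
    n_val = int(len(frames) * ratios[1])
    return frames[:n_train], frames[n_train:n_train + n_val], frames[n_train + n_val:]

frames = extract_frames("upload.mp4", "dataset/images")
train, val, test = split_dataset(frames)
```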
Stock Market Fluctuation Analysis Based on Stock News for Financial Analysis
- Tech Stack: Python, BeautifulSoup, Requests, Pandas, NLTK/VADER, Matplotlib, Yahoo Finance API, Groww Web Scraping, Jupyter Notebook
- Conducted financial analysis by correlating stock price fluctuations with sentiment analysis of stock-related news.
- Built a data pipeline to scrape stock price data from Yahoo Finance and related news from the Groww website, based on stock ticker and date range.
- Combined 7-day historical stock metrics (open, close, volume, etc.) with corresponding news headlines to create a unified dataset.
- Performed sentiment analysis using NLTK/VADER to quantify the emotional tone of news articles and assess their impact on stock performance.
- Automated data collection for a one-month period to identify patterns and correlations between news sentiment and market movement.
- Visualized findings through insightful plots and trends to support financial reporting and decision-making.
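A minimal sketch of the price-versus-sentiment step described above, assuming the yfinance package for Yahoo Finance data and NLTK's VADER analyzer; the ticker, date range, and headlines are illustrative placeholders (the project scraped headlines from the Groww website).

```python
# Minimal price/sentiment correlation sketch.
# Assumes yfinance, pandas, and NLTK's VADER; ticker, dates, and headlines are placeholders.
import yfinance as yf
import pandas as pd
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)

# Daily OHLCV data for the chosen ticker and window.
prices = yf.download("RELIANCE.NS", start="2024-01-01", end="2024-02-01")[["Open", "Close", "Volume"]]

# Placeholder headlines; the project scraped these per ticker and date range.
headlines = pd.DataFrame({
    "date": ["2024-01-05", "2024-01-12"],
    "headline": ["Company posts record quarterly profit", "Regulator opens probe into unit"],
})

sia = SentimentIntensityAnalyzer()
headlines["sentiment"] = headlines["headline"].apply(lambda h: sia.polarity_scores(h)["compound"])
headlines["date"] = pd.to_datetime(headlines["date"])

# Join daily sentiment scores onto the price series to inspect co-movement.
merged = prices.merge(headlines.set_index("date"), left_index=True, right_index=True, how="left")
print(merged[["Close", "sentiment"]].tail())
```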
Audio Translation with Voice Cloning for Speech Understanding
- Tech Stack: Python, FFmpeg, OpenAI Whisper, Coqui TTS, PyDub, IndicTrans2, Hugging Face Transformers
- Built a complete speech-to-speech pipeline to enable multilingual accessibility of classroom lectures with realistic voice cloning, especially for Indian languages.
- Extracted high-quality audio from a 2-hour classroom video using FFmpeg.
- Transcribed and detected source language using OpenAI Whisper for accurate speech-to-text conversion.
- Translated transcribed text into Indian languages (e.g., Hindi, Bengali, Tamil) using IndicTrans2 and Hugging Face Transformer models.
- Synthesized translated speech using Coqui TTS with speaker voice cloning for a natural and consistent audio output.
- Processed and merged audio segments using PyDub for clean and continuous playback.
- Designed for long-form content, with use cases in education, accessibility, and e-learning platforms.
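A minimal sketch of the transcribe → translate → clone-voice pipeline described above, assuming the openai-whisper and Coqui TTS packages; the model names, file paths, and target language are illustrative, and the IndicTrans2 translation step is stubbed with a hypothetical placeholder function.

```python
# Minimal speech-to-speech sketch: transcribe, translate, then synthesize with voice cloning.
# Assumes openai-whisper and Coqui TTS; paths and model names are placeholders.
import whisper
from TTS.api import TTS

def translate_with_indictrans2(text, target_lang):
    """Hypothetical stub for the IndicTrans2 / Hugging Face translation step (omitted here)."""
    return text

# 1. Speech-to-text with automatic source-language detection.
asr = whisper.load_model("small")
result = asr.transcribe("lecture_audio.wav")
source_text, source_lang = result["text"], result["language"]

# 2. Translation into the target Indian language (IndicTrans2 in the actual project).
translated_text = translate_with_indictrans2(source_text, target_lang="hi")

# 3. Text-to-speech with cloning of the original speaker's voice.
tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")
tts.tts_to_file(
    text=translated_text,
    speaker_wav="speaker_reference.wav",   # short clip of the original speaker
    language="hi",
    file_path="lecture_hindi.wav",
)
```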
Speech and Noise Separation with Classification for Forensic Audio Analysis for Speech Understanding
- Tech Stack: Python, Demucs, Spleeter, Noisereduce, Librosa, OpenSMILE, scikit-learn, PyTorch
- Built an advanced audio processing pipeline for forensic analysis by separating speech from multiple background noises and classifying environmental sounds.
- Leveraged Demucs and Spleeter for high-fidelity speech and noise separation in overlapping audio scenarios.
- Integrated Noisereduce and Librosa for noise suppression, feature extraction, and signal enhancement.
- Developed a noise classification module using OpenSMILE for feature extraction and scikit-learn for sound category prediction (e.g., crowd, vehicle, music).
- Implemented a demasking algorithm to reconstruct masked or distorted speech segments, improving clarity and speaker intelligibility.
- Designed for use in forensic environments, enabling investigators to extract usable speech and noise insights from complex audio evidence.
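A minimal sketch of the noise-suppression and noise-classification steps described above, assuming librosa, noisereduce, and scikit-learn; MFCCs stand in for the OpenSMILE features used in the project, and the training rows are random placeholders.

```python
# Minimal noise-suppression and noise-classification sketch.
# Assumes librosa, noisereduce, and scikit-learn; the training data below is a placeholder.
import librosa
import noisereduce as nr
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# 1. Load the evidence audio and suppress stationary background noise.
audio, sr = librosa.load("evidence_clip.wav", sr=16_000)
cleaned = nr.reduce_noise(y=audio, sr=sr)

# 2. Summarise the clip with a fixed-length feature vector (MFCC means here;
#    the project used OpenSMILE features instead).
def features(signal, sr):
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=20)
    return mfcc.mean(axis=1)

# 3. Train a background-noise classifier on labelled clips (random placeholder rows).
X_train = np.random.randn(30, 20)
y_train = np.random.choice(["crowd", "vehicle", "music"], size=30)
clf = RandomForestClassifier().fit(X_train, y_train)

print(clf.predict([features(cleaned, sr)]))
```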
RAG-based Legal Document Intelligence System with Graph & Vector Databases
- Tech Stack: Python, LangChain, ChromaDB, Neo4j (Community Edition), Hugging Face Transformers, LLaMA 3 (8B via Groq), BEIR, SentenceTransformers, PyPDF2, NetworkX, Streamlit
- Designed and deployed a Retrieval-Augmented Generation (RAG) system to streamline legal research by enabling intelligent querying within and across legal documents.
- Deployed LLaMA 3 (8B) on Groq for low-latency, high-performance response generation integrated into the RAG pipeline.
- Trained a SentenceTransformer model using the BEIR benchmark dataset to enhance general-purpose semantic search capabilities.
- Replaced FAISS with ChromaDB, an open-source vector database, for managing dense document embeddings and efficient retrieval.
- Built graph representations of legal documents using Neo4j and NetworkX, enabling inter-document citation tracking and intra-document clause linking.
- Utilized Hugging Face Transformers (e.g., Legal-BERT) for embedding generation and legal language understanding.
- Extracted and structured text from court judgments and legal PDFs using PyPDF2.
- Developed a user-friendly Streamlit interface where legal professionals can query by natural language and receive targeted, explainable summaries and references.
- Significantly reduced time and cognitive effort in searching legal precedents and analyzing multi-case dependencies.
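A minimal sketch of the vector-retrieval half of the RAG pipeline described above, assuming the chromadb and sentence-transformers packages; the embedding model name, clauses, and query are illustrative, and the LLaMA 3 (Groq) generation step is only noted in a comment.

```python
# Minimal vector-retrieval sketch for the legal RAG pipeline.
# Assumes chromadb and sentence-transformers; documents and model name are placeholders.
import chromadb
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")        # stand-in for the fine-tuned model

clauses = [
    "The lessee shall maintain the premises in good repair.",
    "Either party may terminate with ninety days written notice.",
    "Disputes shall be referred to arbitration under the 1996 Act.",
]

client = chromadb.Client()                                # in-memory instance for the sketch
collection = client.create_collection("legal_clauses")
collection.add(
    ids=[f"clause_{i}" for i in range(len(clauses))],
    documents=clauses,
    embeddings=encoder.encode(clauses).tolist(),
)

query = "How much notice is required to end the agreement?"
hits = collection.query(query_embeddings=encoder.encode([query]).tolist(), n_results=2)
print(hits["documents"][0])

# The retrieved clauses would then be packed into the prompt sent to LLaMA 3 on Groq.
```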