Soumik Hati

Data Scientist
Iswarpur, Birbhum, WB

Summary

Data Science and AI professional with hands-on experience delivering end-to-end Machine Learning and Deep Learning solutions in NLP and Computer Vision. Developed a scalable Extractive Text Summarization system using Python and TF-IDF with multi-format input support, focused on efficiency and real-world usability. Built a healthcare AI pipeline for histopathological image analysis using U-Net for tumor segmentation and ResNet/VGG for classification, achieving 92%+ accuracy with strong Dice and IoU metrics. Strong in data preprocessing, model evaluation, and performance optimization. Passionate about building scalable, business-impacting AI solutions and contributing to high-performance teams in global organizations.

Overview

7 Certifications
1 Language

Work History

Data Scientist Intern

Augmenza Tech Private Limited
Bhopal, India (Remote)
12.2025 - 01.2026
  • Supported staff members in their daily tasks, reducing workload burden and allowing for increased focus on higher-priority assignments.
  • Gained valuable industry experience by applying learned concepts directly to relevant work situations.
  • Analyzed problems and worked with teams to develop solutions.
  • Contributed to a positive team environment by collaborating with fellow interns on group projects and presentations.
  • Gained hands-on experience in various software programs, increasing proficiency and expanding technical skill set.
  • Participated in workshops and presentations related to projects to gain knowledge.

Education

Bachelor of Technology (B.Tech) - Computer Science (Data Science Focus)

Techno Main Salt Lake
Kolkata
04.2001 -

Skills

Programming Languages: Python, SQL, R

AI & Machine Learning: Natural Language Processing (NLP), TF-IDF, Machine Learning, Deep Learning

Data Analysis: Statistical Analysis, Hypothesis Testing, A/B Testing

Data Tools & Visualization: Pandas, NumPy, Power BI, Tableau, Excel (Advanced and Pivot Tables)

Cloud & Data Engineering: AWS, ETL, Data Wrangling, Data Cleaning

Databases: SQL, MySQL, MongoDB, Redis

Accomplishments

Histopathological Image Segmentation and Classification (June 2024-Dec 2024): Built an end-to-end Deep Learning pipeline for automated histopathological image analysis to support accurate cancer diagnosis. Implemented U-Net for tumor and nuclei segmentation, achieving strong Dice and IoU scores. Developed CNN-based classification models using ResNet and VGG to classify tissue samples (benign vs. malignant), achieving 92%+ accuracy on BreakHis and Camelyon16 datasets.
Applied preprocessing techniques including color normalization, patch extraction, and data augmentation to improve model generalization. Performed model evaluation using accuracy, Dice coefficient, and IoU metrics.
Utilized Python, Deep Learning, Computer Vision, CNN architectures, and medical image processing to deliver a scalable healthcare AI solution.
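
For readers unfamiliar with the segmentation metrics named above, the following is a minimal sketch of how the Dice coefficient and IoU can be computed for two binary masks. It is not the project's actual evaluation code; the function name `dice_and_iou`, the flat 0/1 list representation, and the convention of scoring two empty masks as 1.0 are assumptions made for illustration.

```python
def dice_and_iou(pred, target):
    """Return (dice, iou) for two binary masks given as flat sequences of 0/1."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")

    # Pixels marked positive in both masks
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    pred_sum = sum(1 for p in pred if p)
    target_sum = sum(1 for t in target if t)
    union = pred_sum + target_sum - intersection

    if union == 0:
        # Both masks are empty: treated as perfect agreement (an assumed convention)
        return 1.0, 1.0

    dice = 2 * intersection / (pred_sum + target_sum)
    iou = intersection / union
    return dice, iou


# Hypothetical 4-pixel masks: one overlapping pixel out of three marked in total
dice, iou = dice_and_iou([1, 1, 0, 0], [1, 0, 1, 0])
```

Dice weights the overlap twice relative to the combined mask sizes, so it is never lower than IoU for the same pair of masks.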

Extractive Text Summarization Tool (NLP, TF-IDF) (Jan 2025-Jun 2025): Developed an end-to-end Extractive Text Summarization system using Python and NLP to generate concise summaries from large documents. Implemented a TF-IDF–based sentence ranking algorithm to identify and extract the most relevant information with high accuracy and readability.
The application supports multi-format inputs including direct text, TXT/PDF uploads, and Wikipedia links, ensuring flexibility and real-world usability. Designed the solution to be lightweight, computationally efficient, and scalable.
Demonstrated expertise in Python, Natural Language Processing, text preprocessing, feature extraction, and algorithm optimization while delivering a practical AI-driven solution.
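
As an illustration of the TF-IDF sentence ranking described above, here is a minimal sketch using only the Python standard library. It is not the tool's actual implementation; the function name `summarize`, the regex-based sentence splitting, the smoothed IDF formula, and the length-normalized score are all assumptions chosen to keep the example short.

```python
import math
import re
from collections import Counter


def summarize(text, num_sentences=2):
    """Pick the highest-scoring sentences by TF-IDF, keeping their original order."""
    # Naive sentence split after ., ! or ? followed by whitespace (an assumption)
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    tokenized = [re.findall(r"[a-z0-9]+", s.lower()) for s in sentences]
    n = len(sentences)

    # Document frequency: number of sentences that contain each term
    df = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        if not tokens:
            scores.append(0.0)
            continue
        tf = Counter(tokens)
        # Term frequency is normalized by sentence length; IDF is smoothed
        score = sum(
            (count / len(tokens)) * (math.log((1 + n) / (1 + df[term])) + 1)
            for term, count in tf.items()
        )
        scores.append(score)

    # Take the top-scoring sentence indices, then restore document order
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:num_sentences]
    return [sentences[i] for i in sorted(top)]


# Hypothetical input text, used only to exercise the function
sample = "Cats sleep. Cats sleep a lot. Quantum computers factor integers quickly."
summary = summarize(sample, num_sentences=1)
```

Sentences built from terms that are rare across the document score higher, which is why the extracted summary favors the most distinctive sentences rather than the longest ones.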

Certification

Getting Started with Enterprise-Grade AI – IBM

Work Availability

Days: Monday, Tuesday, Wednesday, Thursday, Friday, Saturday, Sunday
Times: Morning, Afternoon, Evening

Languages

English: Intermediate (B1)
Hindi: Advanced (C1)
Bengali: Bilingual or Proficient (C2)

Work Preference

Work Type

Full Time

Location Preference

On-Site, Remote, Hybrid

Important To Me

Career advancement, Work-life balance, Company culture, Flexible work hours, Work-from-home option, Personal development programs