Akash Kumar

Summary

I have about 1.8 years of industry experience and 5 years of overall experience in data science, developing machine learning and deep learning solutions. My focus areas include Natural Language Processing (NLP), Natural Language Generation (NLG), data analysis, feature engineering, and model deployment. Two years of research in Computer Vision resulted in two papers published in IEEE and AIP journals. I am proficient in Python and well-versed in ML and DL libraries and frameworks, as well as MLOps tools such as AWS SageMaker, MLflow, and DVC. I am driven by a passion for innovation and continuous growth.

Overview

2 years of professional experience

Work History

Data Scientist

VMock India Private Limited
08.2022 - Current
  • Contributing to cutting-edge R&D initiatives at VMock, specializing in the development of advanced parsing engines using NLP & NLG for career analytics
  • Spearheading the development of generative AI aimed at intelligently paraphrasing text using LLMs to produce resume-specific content
  • Developed an end-to-end graph-based layout detection and merging module using a gradient boosting model fitted on graph edge attributes
  • Improved named-entity recognition by 5% using BERT combined with document layout analysis, with subsection detection logic based on graph edges
  • Reduced the complexity of the merging module and the overall latency of the parser by 30% by replacing NLP and rule-based models with a lightweight tree model
  • Deployed an end-to-end plagiarism detection module using Flask, Celery, Locality Sensitive Hashing, and an approximate k-NN search algorithm
  • Awarded Model Builder of the Year for deploying 5 ML/DL models and developing 2 new capabilities for the product

Education

B.Tech-M.Tech Dual Degree - ME

Indian Institute of Technology, Kanpur
05.2022

Higher Secondary School Certificate

Air Force School, Gorakhpur (CBSE)
03.2017

Secondary School Certificate

Air Force School, Gorakhpur (CBSE)
03.2015

Skills

  • Python Programming
  • SQL
  • C/C++
  • Machine Learning
  • SQL Databases
  • Statistical Analysis
  • Scikit-Learn
  • Natural Language Processing
  • Feature Engineering
  • Data Mining
  • NoSQL Databases

Leadership and Volunteering

  • Coordinator, Dance Club, Media & Cultural Council, IIT Kanpur: Champions at Inter-IIT Cultural Meet 4.0 and First Runner-up at Antaragni'19
  • Student Guide, Counselling Service, IIT Kanpur: Orientation'18 for 900 freshers; personally guided 5 freshmen

Projects and Competitions

  • Predictor for Credit Card Penetration, American Express: Analyze This 2020, 10/2020
      • Segmented customers by profitability to increase customer referral penetration digitally through high incentives in a special-offer campaign
      • Selected features by greedy elimination on feature-target correlation data, reducing dimensionality by 60 percent
      • Combined synthetic minority oversampling technique (SMOTE) and random undersampling, with 0.3 and 0.6 sampling strategies, to address class imbalance
      • Trained Gradient Boosting, Random Forest, and artificial neural network models, achieving a maximum ROC AUC score of 0.764 on validation data
      • Awarded a Pre-Placement Interview for securing the top position among 55 teams
  • Cuisine Prediction, Dr. Faiz Hamid, 06/2019 - 07/2019
      • Classified 39774 dish IDs containing 0.4 million ingredients into 20 cuisines using text mining and machine learning models
      • Analyzed cuisine similarity with GloVe vectors and t-SNE, and visualized feature importance plots to gain insight into the relation between cuisines and ingredients
      • Cleaned text by lemmatization, then applied transformation and feature extraction with DTM and TF-IDF matrices, reducing 6703 sparse columns to 3010 features
      • Ensembled models (SVM, LR, RF) fitted on the transformed data, achieving 80.6 percent accuracy and a 0.74 F1 score on test data

Courses and Certifications

  • Data Structures and Algorithms
  • Probability and Statistics
  • Data Mining and Knowledge Discovery
  • Applied Machine Learning in Python
  • Deep Learning
  • IBM Data Science

Publications

  • Computer vision-based on-site estimation of contact angle from 3-D reconstruction of droplets, IEEE Transactions on Instrumentation and Measurement, 08/2023
  • Estimation of planar angles from non-orthogonal imaging, Review of Scientific Instruments, 01/2024
