Vikas S

Senior Data Scientist
Bangalore, Karnataka

Summary

A dynamic Senior Data Scientist with a proven track record of applying state-of-the-art machine learning and deep learning techniques to drive significant improvements in product development. Skilled in fostering innovation and efficiency through advanced modeling and collaborative problem-solving.

Overview

8 years of professional experience
5 years of post-secondary education
2 Certifications

Work History

Senior Data Scientist

Indegene
Bangalore
08.2021 - Current

AI-Powered Design-to-Code Automation Tool (PDF to Responsive HTML)

  • Product Overview: Spearheaded data science efforts for an AI product that automates the conversion of PDF-based emailers and e-detail designs into responsive HTML. This tool eliminates manual HTML development, streamlining workflows and improving productivity across design and development teams.
  • Key Modules:
  • Content Extraction: Utilized Azure Document Intelligence/AWS Textract to accurately extract text, images, fonts, and metadata from PDFs, ensuring seamless design-to-code translation.
  • Custom Object Detection: Designed and deployed YOLO v10 models to identify key layout features such as CTAs, lines, and other design elements critical for responsive HTML.
  • Generative AI Implementation: Engineered workflows using Claude, GPT, and Gemini to automate HTML-to-MJML conversion for responsive output. Integrated advanced features such as background gradient detection, CTA shape identification, and vertical text property extraction.
  • Layout Structuring: Built custom algorithms to detect multi-row and multi-column layouts, providing critical insights to front-end teams for responsive HTML generation.
  • Advanced Metadata Handling: Enhanced accuracy through Python-based symbol, hyperlink, and superscript extraction, enabling robust content representation.

Impact:

  • Achieved a 70% reduction in development time for converting PDFs of varied complexity into responsive HTML.
  • Improved content extraction accuracy to ensure high-quality outputs for production-ready designs.
  • Adopted by 8-10 enterprise clients, each with 5+ developers, driving significant productivity gains across teams.
  • Scaled to handle diverse PDF complexities, enhancing throughput and reliability of the design-to-code workflow.

ICB (Intelligent Content Brain)

  • Project Overview: Developed an AI-powered tool that processes PDFs (e.g., emailers, e-details) to extract and analyze content intelligently using advanced vision and NLP techniques. This platform provides intelligent content analysis and actionable insights for end clients. Offered on a subscription-based model, it enables clients to process PDFs efficiently while ensuring high accuracy in content extraction and classification.
  • Key Contributions:
  • Key Message Tagging: Fine-tuned a multi-class classification model using BERT to identify and tag critical key messages, enhancing automated content analysis.
  • Image/Text Semantic Modeling: Implemented transfer learning with VGG16 and fine-tuned BERT to build a robust image-text similarity model, enabling cross-modal content alignment.
  • Multilingual NER Model: Utilized spaCy and Hugging Face frameworks to extract named entities in 12+ languages, expanding the tool's global applicability for diverse clientele.
  • Document Slicing: Fine-tuned a VGG16-based model to classify valid and invalid slices of documents, ensuring precise segmentation of relevant content.
  • MLOps Implementation: Designed a comprehensive data science pipeline with:
    Data Collection: Automated workflows for ingesting and preprocessing PDF datasets.
    Model Retraining: Built custom pipelines leveraging AWS for iterative model updates based on feedback loops.
    Model Deployment: Utilized MLflow for seamless deployment and version management.

Impact:

  • Achieved 70-80% accuracy across domain-specific models with iterative retraining, ensuring reliable content extraction for enterprise clients.
  • Enabled on-demand PDF processing for multiple clients under subscription models.
  • Delivered a scalable MLOps dashboard that streamlined workflows for data scientists, reducing pipeline latency and enhancing operational efficiency.

Business Analyst

Genpact
Bangalore
10.2019 - 08.2021
  • Binary Classification for Bad Debt Estimation: Developed a binary classification model using ensemble techniques to estimate the probability of a customer becoming bad debt. Enhanced prediction accuracy by combining multiple models. Impact: Saved ~25% revenue YoY by proactively managing financial risks.
  • Time Series Modeling for Pharmaceutical Sales Prediction: Utilized the Holt-Winters model for forecasting units and sales of multiple pharmaceutical brands over 2 years. Improved accuracy by applying better seasonality and trend factors. Impact: Achieved 10-15% higher accuracy compared to the previous SAS model, improving sales predictions.
  • Engineering Activities - End-to-End Automation: Automated legacy SSIS workflows using Python, eliminating manual steps in ETL processes. Impact: Saved 50+ hours/month, reducing 5 FTRs and improving operational efficiency.

Software Engineer

LTI
Bangalore
12.2016 - 09.2019
  • Built predictive models for internal stakeholders to estimate SLAs for priority incidents and to plan resources against those SLAs, enabling 20-25% better resource utilization.
  • Led server infrastructure development and management of production systems for ETL (Informatica).
  • Automated Admin Console monitoring, reducing manual effort by 15%.

Education

PGP-Business Analytics And Business Intelligence - Data Science

Great Learning | University of Texas, Austin
Bangalore
01.2019 - 01.2020

Bachelor of Engineering - Electronics And Communications Engineering

Vemana Institute of Technology
Bangalore
06.2012 - 06.2016

Skills

Certification

AWS Certified Machine Learning - Specialty
Credential URL: https://www.credly.com/badges/f8f42321-1a9f-4c3e-a72c-22f13cdb803c/public_url

Accomplishments

  • Passionate Mentor
    Dedicated to empowering learners, I have impacted over 237 learners and delivered 176 hours of teaching and mentoring. My commitment to positive engagement has earned me an outstanding mentoring rating of 4.74.
