Chanpreet Singh

Lead Data Scientist
Pune, MH

Summary

Data Scientist with more than 8 years of experience executing data-driven solutions to increase the efficiency, accuracy, and utility of internal data processing. Experienced in building regression models, applying predictive modelling, and analysing data mining algorithms to deliver insights and implement action-oriented solutions to complex business problems. Highly skilled in Python, Deep Learning, Computer Vision, Gen AI, LLMs, and deploying data science solutions.

Overview

9 years of professional experience
6 years of post-secondary education

Work History

Lead Data Scientist

T-Systems
Pune
09.2024 - Current
  • Developed and optimized transformer-based language models, focusing on efficient attention mechanisms and advanced tokenization strategies.
  • Leveraged techniques such as positional encoding and multi-head attention to enhance model performance across diverse NLP tasks, processing the results in BigQuery and Vertex AI.
  • Designed and implemented scalable data pipelines to preprocess and feed large datasets into LLM training workflows.
  • Automated data extraction, cleaning, and feature engineering processes to support high-quality model training.
  • Collaborated with cross-functional teams to address business challenges using innovative techniques in big data analytics.
  • Managed multiple competing priorities effectively by setting clear goals, maintaining open communication channels, and fostering a collaborative work environment among team members.
  • Optimized business processes through the application of data-driven insights and recommendations.

Lead Data Scientist

Cognizant
Pune
11.2021 - 09.2024
  • Created a time series forecasting model using SARIMA to predict provider demand, and optimized provider utilization according to availability and performance using linear programming.
  • Developed a system that groups providers on the basis of their past performance using the K-means clustering algorithm.
  • Built an Artificial Intelligence-based platform that can run any machine learning or deep learning model and show the output in JSON format.
  • Used Flask, Django, Python, Docker, Machine Learning, Deep Learning, JSON, and MongoDB.
  • Built a Neuro native engine tool that extracted fields from documents of different structures using rule-based regular expressions in Node.js and Python, collected data from multiple sources, and prepared training data that helped achieve better accuracy.
  • Developed a Python processor tool that can run any Python script on a Node.js or AngularJS platform, install any Python packages, and visualize the output in the UI.
  • Performed data visualization using AWS Qlink, loading data from an S3 bucket and creating informative charts that make the data quick to understand.
  • Streamlined reporting processes by standardizing templates and automating routine tasks, saving time for more strategic activities.
  • Managed multiple competing priorities effectively by setting clear goals, maintaining open communication channels, and fostering a collaborative work environment among team members.

Data Scientist

Tata Consultancy Services
Bangalore
07.2016 - 11.2021
  • Built a NER model to predict the class and details of a product from its description and barcode, and mapped all 108 characters using a rule-based Python model, saving considerable time and manual mapping effort.
  • Created a financial spreading automation tool that extracted data from a large collection of financial statements (PDF format), and built a text classifier using TF-IDF and SVM in Python to map the extracted data for a large UK client, saving the time and cost associated with the manual process.
  • Built an image classification model using Keras and a CNN to classify scraped images by type (e.g., shirts, jackets, coats, trousers); the images were scraped from 250 websites using Beautiful Soup, Scrapy, urllib, and python-requests.
  • Previously, image classification was done manually, which was inaccurate, time-consuming, and required more human intervention.
  • Built a model to estimate the GDP of all the districts of India using the Indian Government's 2011 census data in Python and R, and presented the results to the COO of TCS.
  • Trained a team of 24 employees on Deep Learning, Machine Learning, and Python
  • Conducted deep-dive data analysis with Python, improving accuracy and reliability.

Education

Master Of Computer Applications - Computer Science

Sastra University
Thanjavur
2016.08 - 2019.07

Bachelor Of Computer Applications - Computer Applications

CSJM University
2013.07 - 2016.06

Skills

  • Operating Systems: Windows 11, Linux, MacOS
  • Databases: MySQL, PostgreSQL, MongoDB
  • Applications: Machine Learning, Data Preparation, Agile Methodologies, Data Structures and Analysis, Data Visualization, Business Analysis, Exploratory Data Analysis, ETL (Extract, Transform, Load), Data Modeling and Wrangling, OpenAI, Gen AI, LLM models, Transformers, GAN, RNN, Web scraping, TensorFlow, Predictive Modelling, Quantitative Analysis, SQL/NoSQL, Data Flow Diagrams (DFD), Deep Learning, GCP, BigQuery, Vertex AI, Data Flow, AWS Cloud, Data and Cloud Security, SWOT Analysis, Pattern Recognition, Statistical Analysis, Quality Management, REST APIs, CI/CD Pipelines, GitHub, Git, Data Warehouse, AI (Artificial Intelligence), Software Development, Application Development, JSON, Data Mining, Regression Analysis, Relational Databases, Big Data Analysis, PySpark, Grafana, Presentation Skills, System Analysis, Quantitative Research, Project Management, Leadership, Marketing, Data Management, Predictive Analysis
  • Programming: Python, R, NodeJS, Java, JavaScript, OpenCV, Django, SQL, Go, Spark, Docker
  • Tools: Tableau, Microsoft Azure Portal, Jira, Matlab, Jupyter Notebook, Grafana
  • Neural networks
  • Experimental design
  • Feature engineering
  • Data wrangling
  • Machine learning
  • Data quality assessment
  • Natural language processing
  • Sentiment analysis
  • Optimization techniques
  • Reinforcement learning
  • Predictive analytics
  • Anomaly detection
  • Algorithm development
  • Statistical modeling
  • Ensemble methods
  • Dimensionality reduction
  • Deep learning
  • Time series analysis
  • Data integration
  • Python programming
  • Advanced mathematical skills
  • Database management
  • Statistical analysis
  • Data mining
  • NoSQL databases
  • Data operations
  • Business requirements gathering
  • BI tools expertise
  • Regression library management
  • Teamwork
  • Problem-solving
  • Time management
  • Organizational skills
  • Team collaboration
  • Decision-making

Work Preference

Work Type

Full Time

Work Location

On-Site, Remote, Hybrid

Important To Me

Career advancement, Work-life balance, Company culture, Personal development programs

Timeline

Lead Data Scientist

T-Systems
09.2024 - Current

Lead Data Scientist

Cognizant
11.2021 - 09.2024

Master Of Computer Applications - Computer Science

Sastra University
2016.08 - 2019.07

Data Scientist

Tata Consultancy Services
07.2016 - 11.2021

Bachelor Of Computer Applications - Computer Applications

CSJM University
2013.07 - 2016.06