Kanishk KUMAR


Noida

Summary

Data Scientist experienced in gathering, cleaning and organizing data for use by technical and non-technical personnel. Highly organized, motivated and diligent, with a significant background in data science, machine learning and Python. A hardworking and passionate job seeker with strong organizational skills and 1+ years of total experience, eager to secure an Associate Analyst position.

Overview

4 years of professional experience
1 Certification

Work History

Associate Sports Data Scientist - Internship

Higher School of Economics
Moscow
01.2023 - 06.2023
  • Analyzed football match and event data using machine learning models such as logistic regression, K-means clustering and sequential neural networks.
  • Applied loss functions and explained-variance techniques to compare AUC curves and accuracy scores across models.
  • Used the pandas library to read and write spreadsheets.
  • Used the NumPy library for efficient computation.
  • Used the Matplotlib library for data visualization.
  • Applied statistical methods, Long Short-Term Memory (LSTM) networks and clustering to sports data to improve decision-making.
  • Devised and deployed predictive machine learning models using piecewise regression, classification and reinforcement learning algorithms to drive business decisions.
  • Compiled, cleaned and manipulated data using regular expressions and the JSON library.
  • Determined pass counts, likely passes and predictability, and estimated pass difficulty in football.

Associate Analyst

GlobalLogic Technologies Pvt Ltd
Gurugram
11.2022 - 05.2023
  • Worked on Python-based sources for a Google data project.
  • Used Google data analysis tools and resources.
  • Handled computer vision tasks including image classification, detection and face recognition.
  • Worked with natural language processing (NLP) systems and applied text representation techniques.
  • Used PostgreSQL and DBeaver for advanced querying, together with Celery, data visualization and analytics tools, to interpret trends in complex data sets.
  • Evaluated machine learning methods and made appropriate changes to increase productivity.
  • Stayed current on developments in related tools and frameworks such as Flask, Docker and Apache Spark, and worked independently to design, develop and test code.
  • Analyzed data sets using Python and statistical modelling tools to meet the company's analysis needs.

Technologies - Project Intern

AirCode Technologies Pvt. Ltd.
Lucknow
12.2019 - 05.2020
  • Provided technical support for maintaining Python code, debugging project problems, enhancing layouts and other services.
  • Designed and built data infrastructure to support business initiatives.
  • Optimized data access and storage to improve the performance of analytics systems; produced monthly reports using advanced Microsoft Excel spreadsheet functions.
  • Acquired knowledge of industry trends and developed solutions and strategy through effective research and presentations for project needs.
  • Developed reports using SQL Server Reporting Services.
  • Identified bugs in data collection or reporting and investigated root causes.

Education

Master of Science - Data Science

Higher School of Economics
Moscow, Russian Federation
07.2023

Bachelor of Technology - Food Technology

Uttar Pradesh Technical University, Harcourt Butler Technological Institute
Kanpur, India
06.2017

Skills

  • Python Language
  • SQL, Relational Databases and Data Definition Language (DDL)
  • PostgreSQL and DBeaver
  • Data Analytics
  • Python classes and functions
  • NumPy, pickle and pandas libraries
  • Apache Spark
  • Docker
  • MLflow
  • Flask and Celery
  • Microsoft Excel
  • Data Collection
  • Data Visualization
  • Machine Learning
  • Natural Language Processing
  • Data Mining
  • Data Distribution
  • Data Structure and Algorithms
  • Applied Statistics and Basic Statistics
  • Discrete Mathematics
  • Sentiment Analysis
  • Cluster Analysis
  • Unsupervised Learning
  • Predictive Models
  • Structured and Unstructured Data
  • Linear Regression
  • Outcome Predictions
  • C Language
  • Decision-Making

Certification

IBM - Python 101 for Data Science

RESEARCH

Health and Medicine Assistant Chatbot Project [Skill Hackathon] - Moscow, Russia 11/2022 - 02/2023

  • Based on the Dialog Flow Framework (DFF) and its Dialog Flow Engine (DFE), which allow writing conversational services such as chatbots for social networks.
  • Contains a data folder with data files in 6 different languages.
  • Returns responses to users through medical chatbot services. Supports English as the mainstream language and Spanish, Dutch, Russian, Danish and Greek as non-mainstream languages.

MLflow Docker Flask-Iris-Classification Application Project [Large Scale Machine Learning] - Moscow, Russia 03/2023 - 05/2023

  • Deployed an Iris flower classification model as a Flask application.
  • Built a Docker image of the application and ran it in a Docker container.
  • Used the MLflow components: Projects, Tracking, Models and Model Registry.
  • Used the asynchronous Celery library to build distributed task queue applications.
  • Deployed the machine learning model to production and performed exploratory data analysis on the Iris flower dataset.
  • Provided an HTML frontend in the web browser for predictions from flower measurement data.

Advanced Python Spark DataFrame and HDFS RDD Project [Outbrain Click Prediction] - Moscow, Russia 01/2023 - 03/2023

  • Used the Hadoop Distributed File System (HDFS) for data loading and processing.
  • Used RDDs over HDFS data for data transformation and collection.
  • Analyzed the data with Spark DataFrames using PySpark and SQL.
  • Calculated the most visited document and topic identifier in the page views data.
  • Counted users by traffic source in the page views data and its encoded form.

Modern Data Analysis Machine Learning Project [IMDb movies Dataset] - Moscow, Russia 09/2022 - 10/2022

  • Trained K-Neighbors Classifier, Decision Tree Classifier and Logistic Regression models.
  • Used scikit-learn features: metrics, accuracy score, classification report, confusion matrix and model selection.
  • Data visualization and evaluation tools: Matplotlib, seaborn scatter plots and cross-validation metrics.

SQL Sales Planning and Evaluation Project [Analysis Bike Production] - Moscow, Russia 05/2022 - 08/2022

  • Relational database for plan data management, data manipulation, plan management, materialized views and Data Definition Language (DDL).
  • SQL queries with aggregate functions such as avg, max and count.
  • SQL operators and clauses, joins, views and tables.
  • Environment: PostgreSQL, DBeaver

Python Advanced CountVectorizer Project [Tf-Idf] - Moscow, Russia 10/2021 - 12/2021

  • Converted a collection of text documents to a matrix of token counts using Python and its libraries.
  • Method: term frequency-inverse document frequency (TF-IDF)
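
A minimal sketch of the two steps named above (token-count matrix, then TF-IDF weighting), written in plain Python. The function names, the whitespace tokenizer and the sample documents are illustrative assumptions, not the project's code. The idf used here is the unsmoothed textbook form log(N / df), which differs from the smoothed default in scikit-learn.

```python
import math
from collections import Counter


def count_matrix(docs):
    """Return (vocabulary, matrix), where matrix[i][j] counts term j in doc i."""
    # Assumption: lowercase + whitespace split stands in for a real tokenizer.
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    index = {term: j for j, term in enumerate(vocab)}
    matrix = [[0] * len(vocab) for _ in docs]
    for i, toks in enumerate(tokenized):
        for term, n in Counter(toks).items():
            matrix[i][index[term]] = n
    return vocab, matrix


def tfidf_matrix(counts):
    """Weight raw counts by idf = log(N / df), the unsmoothed textbook form."""
    n_docs = len(counts)
    n_terms = len(counts[0]) if counts else 0
    # Document frequency: number of documents containing each term.
    df = [sum(1 for row in counts if row[j] > 0) for j in range(n_terms)]
    idf = [math.log(n_docs / d) if d else 0.0 for d in df]
    return [[row[j] * idf[j] for j in range(n_terms)] for row in counts]


# Hypothetical sample documents, for illustration only.
docs = ["the pass was long", "the pass was short", "a long shot"]
vocab, counts = count_matrix(docs)
weights = tfidf_matrix(counts)
```

Terms that occur in fewer documents receive a larger idf, so a word unique to one document outweighs a word shared by several.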

Languages

English: Advanced (C1)
Hindi: Advanced (C1)

Work Availability

Days: Monday, Tuesday, Wednesday, Thursday, Friday, Saturday, Sunday
Times: morning, afternoon, evening

Quote

The opposite of a true statement is a false statement, but the opposite of a profound truth may well be another profound truth.
Niels Bohr
