Anto Lourdu Xavier Raj Arockia Selvarathinam

Data Analyst Student
Salem, TN

Summary

Data analyst with broad-based experience in building data-intensive applications and overcoming complex architectural and scalability issues in diverse industries. Proficient in predictive modeling, data processing, and data mining algorithms, as well as programming languages including Python and Java. Capable of creating, developing, testing, and deploying highly adaptive services that translate business and functional requirements into substantial deliverables.

Overview

5 years of post-secondary education
2 languages

Work History

Business Analytics Trainee

MedTourEasy
New Delhi, India
10.2020 - 11.2020
  • All internship tasks were assigned and submitted through the LMS.
  • Completed 14 guided projects from the Coursera Project Network as the initial task assigned by my training head.
  • Submitted the certificates for all guided projects in the LMS as proof of completion.
  • After my training head inspected and verified the certificates, I was enrolled in the final project.
  • Final project: "Classify Suspected Infection in Patients using R programming".
  • Submitted the final project in the LMS as a project report and received a certificate of appreciation from my training head.

Data @ ANZ Program

Forage
San Francisco, California
10.2020 - 10.2020
  • Completed the two tasks assigned in this virtual internship programme: exploratory data analysis and predictive analytics.
  • Exploratory data analysis: segmented the dataset provided as a .csv file, visualized transaction volume, and concluded by highlighting the outliers.
  • Predictive analytics: explored correlations between the attributes, with customer name as the prominent one, and built regression and decision tree prediction models based on the findings.

Global Data Analytics Internship

Takenmind Technologies
Gurugram, Haryana, India
10.2020 - 10.2020
  • This internship focused on learning combined with exploration.
  • To begin, we enrolled in the study kit on Udemy, which was free of cost, and studied the concepts it taught.
  • We were then given 4 assignments and 2 projects, with one optional task in each category.
  • Assignment 1 (mandatory): based on NumPy and Pandas concepts; wrote documentation of about 500 words, including code.
  • Assignment 2 (mandatory): based on file operations and data analysis techniques; created a single .xlsx file with 10 sheets of dummy data, read it using Pandas, and exported every sheet as a .csv file.
  • Assignment 3 (mandatory): based on data visualization concepts; downloaded the given dataset, created a pivot table with labelled x and y axes, and plotted a heatmap from the pivot table using seaborn.
  • Assignment 4 was optional; I skipped it.
  • Project 1 (mandatory, proof of concept): downloaded the given problem dataset, formulated the methodology and techniques on our own, and presented them in an MS PowerPoint file of at most 15 slides.
  • Project 2 was optional, so I skipped it. Through this one-month internship I learned data analytics concepts as well as punctuality (i.e. submitting projects and tasks on time).

Data Science and ML Intern

The Sparks Foundation
Singapore
09.2020 - 10.2020
  • This internship was based on practical, content-related activities, with a small set of assigned tasks to complete.
  • All assigned tasks had to be stored and maintained in a GitHub repo for future reference.
  • Task 1: maintain at least 500 LinkedIn connections, which helped me share knowledge with and learn from my connections.
  • Task 2: explored linear regression using the given iris dataset; completed and maintained in my GitHub repo.
  • Task 3: explored unsupervised machine learning on the given iris dataset; completed and maintained in my GitHub repo.
  • Task 4: performed clustering on the same dataset and updated my GitHub repo on completion.
  • Task 5: performed EDA, which was quite complex, on a dataset completely different from those of the other tasks; completed and maintained in my GitHub repo.
  • All tasks in this internship programme were carried out in the Jupyter environment with Python as the main language.

Data Science Intern

Exposys Data Labs
Bengaluru, Karnataka
09.2020 - 09.2020
  • The main task of this internship was customer segmentation using the K-means algorithm in Python or R.
  • I chose R and implemented the task in it.
  • Four tasks were assigned in total.
  • The first task was to implement code for the problem statement, customer segmentation using K-means.
  • The second was to prepare a report on the topic.
  • The third was to prepare a presentation on customer segmentation.
  • Finally, I gave a code demo explaining each part of the code I had implemented, using various packages and algorithms, to solve the problem statement.
  • After completing all assigned tasks, I received a certificate of appreciation.

KPMG Data Analytics Virtual Intern

InsideSherpa
San Francisco Bay Area
08.2020 - 08.2020
  • Completed 3 modules in this virtual internship programme (i.e. Data Quality Assessment, Data Insights, Data Insights and Presentations).
  • Data Quality Assessment: downloaded the given dataset, which contained errors, rectified it, converted it from a .csv file into a Word document, and uploaded it.
  • Data Insights: prepared a presentation on the given topic, which covered data exploration, data economy, and statistical approach.
  • Data Insights and Presentations: was given the cryspt dataset to work on, extracted it, and used the visualization tool Tableau to complete the task, which was mainly about visually segmenting the average number of persons according to their requirements.

Data Science Intern

Kaashiv Infotech
Chennai, Tamil Nadu
08.2020 - 08.2020
  • Worked on structured and unstructured datasets.
  • Learned the differences between various fields (i.e. ML, AI, deep learning, big data).
  • Performed mean manipulation in the Jupyter environment with Python: imported pandas, created a DataFrame, calculated the mean of the data, and trimmed and untrimmed specific values from the dataset.
  • Also computed a PMF: imported matplotlib, created a dictionary, built the PMF using a for loop, and plotted the data using plt functions.
  • Lastly, computed various statistical measures (i.e. R-squared, mean, median, mode, range, quartiles, regression) using the given data.
  • Also worked on the mtcars dataset in RStudio, learning how to access and validate datasets.

Education

10th Standard

Holy Cross Matriculation Higher Secondary School
Ammapet, Salem, Tamil Nadu 636014
06.2014 - 05.2015

12th Standard

Holy Cross Matriculation Higher Secondary School
Ammapet, Salem, Tamil Nadu 636014
06.2016 - 04.2017

B.E. Computer Science

Sona College of Technology
Junction Main Rd, Salem, Tamil Nadu 636005
08.2019 - 05.2023

Skills

    Business analytics

Licenses & certifications

  • Step Into Robotic Process Automation, GUVI Geek Networks
  • NCC Covid19 Awareness Certification, Loyola-ICAM College Of Engineering And Technology
  • Data Science In Digital Marketing, Google Digital Unlocked
  • A-Z Data Science Certificate Of Completion, Udemy
  • What Is Data Science?, IBM
  • The Data Scientist's Toolbox, Johns Hopkins University
  • Statistical Inference, Johns Hopkins University
  • Reproducible Research, Johns Hopkins University
  • Regression Models, Johns Hopkins University
  • R Programming, Johns Hopkins University
  • Practical Machine Learning, Johns Hopkins University
  • Learn to Code With Python3, Udemy
  • Getting And Cleaning Data, Johns Hopkins University
  • Exploratory Data Analysis, Johns Hopkins University
  • Data Science Orientation, Coursera
  • Tools For Data Science, IBM
  • Python For Data Science And AI, IBM & Coursera
  • Programming In C, Sona College Of Technology
  • Developing Data Products, Johns Hopkins University
  • Databases And SQL For Data Science, IBM & Coursera
  • Data Science Methodology, IBM & Coursera
  • Data Science Capstone, Johns Hopkins University
  • Data Analysis With Python, IBM & Coursera, My Captain
  • Machine Learning With Python, IBM & Coursera
  • IBM Data Science, Coursera
  • Data Visualization With Python, IBM & Coursera
  • Data Science Professional Certificate, Coursera
  • Big Data Workshop On Covid-19 Lockdown Analysis, Edureka
  • Applied Data Science Capstone, IBM
  • Visualizing Citibike Trips With Tableau, Coursera Project Network
  • Predictive Modelling With Azure Machine Learning Studio, Coursera Project Network
  • Pandas Python Library For Beginners In Data Science, Coursera Project Network
  • NLP : Twitter Sentiment Analysis, Coursera Project Network
  • Merge, Sort And Filter Data In Python Pandas, Coursera Project Network
  • Introduction To Python, Coursera Project Network
  • Intro To Time Series Analysis In R, Coursera Project Network
  • Getting Started In Google Analytics, Coursera Project Network
  • Exploratory Data Analysis With Seaborn, Coursera Project Network
  • Exploratory Data Analysis With Python and Pandas, Coursera Project Network
  • Create Interactive Dashboards With Streamlit And Python, Coursera Project Network
  • Covid19 Data Analysis Using Python, Coursera Project Network

Publications

Multiple Heterogeneous Information Source

INTERNATIONAL ACADEMIC JOURNAL OF SCIENCE AND ENGINEERING (IAJSE), ISSN 2454-3896, May 7, 2021

Publication URL

https://drive.google.com/file/d/198S3dVwgdnJ14eZwkT-6Py6v-BPDtmC9/view?usp=drivesdk

Description

- In Multiple Heterogeneous Information Sources (MHIS), a base-station network, namely an ad-hoc network, receives internet connectivity from the cloud and shares it with various electronic gadgets through a data transmission cable or a lightweight sensor.

- Here the ad-hoc network is the primary source and the electronic gadgets are the secondary source. While data is transferred to the secondary source, some transmission loss occurs, which reduces the arrival rate. In my paper I refer to this transmission loss as packet loss, which can be rectified using the Rapid Package Transmission Algorithm (RPTA).

- The main role of RPTA is to remove duplicate packet loss from the source network and to deliver a fast packet flow to the destination (i.e. the secondary source) without loss.

- My proposed method uses two mechanisms: a fault tolerant mechanism (FTM) and a fault detection mechanism (FDM). FTM can be applied to sensor networks to maximize the availability of operating-node connectivity between the source and destination of the sensor network. In FDM, wireless hubs are used to overcome packet drop loss between the source and destination of the sensor network, and wired hubs are used for tunneling and traffic examination of network sources.

- In the results and discussion, packet delivery ratio is used to test the quality of the source and destination network; end-to-end delay is the difference between sender and receiver time in the optimization and clustering of network sources; and throughput ratio defines how much node.


Video Stream Analysis In Clouds Framework For High-Performance Video Analysis Using Video Synchronization Analysis Algorithm

International Research Journal Of Engineering And Technology (IRJET), An ISO 9001:2008 Certified Journal, Jun 19, 2021

Publication URL

https://drive.google.com/file/d/1WXCYwiQuuOxsLr5OuV_DEuAsPnuvq5Yv/view?usp=drivesdk

Description

- The main aim of this publication is to overcome the inaccurate information that arises from human involvement. This can be achieved through high-performance video analysis using cloud computing.

- This theoretical scenario helps minimize human intervention and time calculations, and enables the processing of a large number of video streams.

- In the proposed video processing platform I used the video synchronization analysis algorithm (VSAA) to deliver high data transmission performance and to evaluate a large number of video streams while reducing data transmission.

- My work is divided into two sub-evaluations. The first evaluates the extensibility of the information platform, and the second is carried out on a node in a personal cloud. The main role of the node is to limit the actual experimental measurements.

- In video networks, this terminal application adapts to processor power and screen resolution so that different source codes can be decoded.

- In Hadoop-based platform services, video stream analysis in cloud frameworks has its own impact compared to any other.

- The video analysis framework is used to store unique recordings. The weblog documents record the behaviour of interpersonal video applications that clients can access at their leisure.

- I used the Sparkle framework in my research work. Its main advantage is that it is a fast framework able to handle huge information analytics applications.

- Video synchronization is the proposed system in the video analysis algorithm. Within the framework, it helps process video streams efficiently and reduces processing delays using an installation server in the cloud.

- Administrators using this structure specify logical standards and video span transfers for examination in VSAA.

- In the video transfer flow, the framework is installed on a cloud computing server and uses a GPU to reduce latency in the video analytics process.

- Results are obtained precisely.


Scientific Comprehension Of Learning Particularly Deep Learning Algorithms

International Research Journal Of Engineering And Technology (IRJET), An ISO 9001:2008 Certified Journal, Aug 31, 2021

Publication URL

https://drive.google.com/file/d/15VF5KBLkQtVyW0OSjOyLasGwkMVBV_fG/view?usp=drivesdk

Description

- Healthcare organizations of all sizes, types and specialities are on the increase.

- DL is becoming transformative for healthcare. Deep learning provides automatic detection and interactive exploration of object features and functionality.

- However, its potential in medical analysis has not yet been fully developed. To address this, Restricted Deep Boltzmann Machines (RDBM) are proposed as the algorithm, providing a new and effective paradigm for obtaining an end-to-end learning model from complex data.

- The performance of DL and other ML technologies increases with data size. The main advantage of large-scale DL architectures is that, as the existing data grows, the size of the DL model can grow with it.

- In the medical field in general, deep learning provides interactive auto-discovery and auto-detection of object features and functions.

- Deep learning has paved the way for hospitals, through cloud service providers, to mine the large volumes of multimodal unstructured information stored in research institutions with unprecedented power and efficiency, in support of personalized health management.

- The purpose of this method is to emphasize the important principles and applications of DL in the healthcare and medical fields.

- The proposed algorithm, Restricted Deep Boltzmann Machines (RDBMs), has two layers: a visible layer and a hidden layer. The hidden layer is made of nodes that extract feature information from the data, and the output is calculated as the sum of the input layers.

- The deep learning algorithm was applied to the main data set, and balanced or synthetic data was added as a workaround.

- With the widespread adoption of EHRs, the large amount of information urgently requires effective tools to convert the data into conclusions, knowledge and action.


Projects

  • Classify Suspected Infection In Patients
  • Covid 19 Data Analysis
  • Time Series Analysis
  • Tumor Diagnosis Using EDA
  • Explore Business Analytics
  • Explore Decision Tree Algorithm
  • Prediction Based Model On Linear Regression
  • Unsupervised Machine Learning Using Clusters
  • Movies Recommendation System Using CF
  • Statistical Analysis Of User's Banking Aspects Using Tableau

Accomplishments

Covid 19 Crowdfunding Award

Issued by Hamari Pahchan NGO • Jun 2020

- Received a certificate from Sarvesh Agarwal, CEO of Hamari Pahchan NGO, for the Covid 19 Crowdfunding Awareness program conducted through Internshala.

Best Public Speaker Award

Issued by Sona College Course Instructor • Nov 2019

- Spoke about the importance and value of "IOT" at college level.

- On this topic, I gave various examples, drawing on data collected from various domains and sources.

- Also showed several orientation videos and ideas on IOT during my presentation.

- The key point of my conclusion was that, with the help of IOT, productivity and economic levels rise day by day with a large impact.

Guinness World Record

Issued by Elite, Asian, Indian, Tamil Nadu World Records Private Limited • May 2016

- Set a Guinness World Record as part of a team for the Longest Indian Folk (Live Music) Concert, lasting nearly 43 hrs.

- I was one of the participants who achieved the record.

- The record was approved by Elite World Records Pvt Ltd, Asian Records Academy, Indian Records Academy, Tamilian Book Of Records.

Best Sportsmanship Award

Issued by Principal, Holy Cross Matriculation Higher Secondary School • Nov 2013

- Based on my past and current achievements in sports activities, my school presented me with the "Best Sportsmanship Award" during the school sports day in 2013.

Excellence Education Award

Issued by Anna University Vice Chairman • May 2011

- Won a gold medal for securing the highest marks among all students at school level for five consecutive years.

Best Public Speaker Award

Issued by Principal, St John's Matriculation Higher Secondary School • Aug 2010

- An interschool cultural competition was conducted at my school when I was in 3rd grade.

- I participated and won first prize.

- The topic I spoke on was "Harmony".

Golden Pencil Award

Issued by Principal, Camilin Chairman • Sep 2007

- Nearly 100 students from my school participated in Camilin Handwriting Competition.

- My handwriting was chosen as the best among them, and I received a certificate and medals from the Camilin Chairperson in 2007.

Timeline

Business Analytics Trainee

MedTourEasy
10.2020 - 11.2020

Data @ ANZ Program

Forage
10.2020 - 10.2020

Global Data Analytics Internship

Takenmind Technologies
10.2020 - 10.2020

Data Science and ML Intern

The Sparks Foundation
09.2020 - 10.2020

Data Science Intern

Exposys Data Labs
09.2020 - 09.2020

KPMG Data Analytics Virtual Intern

InsideSherpa
08.2020 - 08.2020

Data Science Intern

Kaashiv Infotech
08.2020 - 08.2020

B.E. Computer Science

Sona College of Technology
08.2019 - 05.2023

12th Standard

Holy Cross Matriculation Higher Secondary School
06.2016 - 04.2017

10th Standard

Holy Cross Matriculation Higher Secondary School
06.2014 - 05.2015