
Ashok Arun Kumar Selvaraj

Chennai

Summary

Highly skilled Technical Lead with a deep understanding of software development, project management, and team leadership. Brings strong problem-solving skills, technical proficiency, and the ability to deliver high-quality work on time. Key contributor to product innovation and improvements in previous roles. Committed to fostering positive team dynamics, clear communication, technical innovation, and customer satisfaction.

Overview

17 years of professional experience

Work History

Technical Lead

Wipro
Chennai
10.2022 - Current
  • Evaluated existing applications for defects or improvements in functionality or performance.
  • Conducted code reviews to ensure high-quality code was produced that adhered to coding standards.
  • Implemented continuous integration, continuous delivery pipelines for automated deployments.
  • Designed and developed end-to-end data pipelines using Azure Data Factory and Azure Databricks, improving data processing efficiency.
  • Tuned and refactored ETL jobs for performance enhancement.
  • Implemented data storage solutions using Azure Data Lake Storage and Azure Blob Storage, ensuring data was securely stored and easily accessible.
  • Designed and created optimal ETL processes and scripts to read, analyze, and digest what the business wants to accomplish with its data.
  • Performed unit testing and SIT on designed jobs to ensure they met requirements.
  • Worked closely with customers and other stakeholders to determine the planning, implementation, and integration of system-oriented projects.
  • Monitored and optimized the performance of data pipelines and storage solutions to ensure reliability and cost-efficiency.
  • Reviewed Python code, ran troubleshooting test cases, and resolved bug issues.
  • Created Python batch scripts to automate hourly runs of the ETL scripts.
  • Created Azure Logic Apps workflows to integrate SharePoint files with Databricks and send automated emails once transformations completed.
  • Created dashboards and reports in Power BI.
  • Developed Azure Data Factory flows to automate PySpark jobs in Azure Databricks for data ingestion and mining (a minimal sketch follows this list).
  • Developed and automated frequently used Scala and Spark jobs using CI/CD tools.
  • Used Bitbucket, GitLab, and GitHub extensively for code deployment and storage.
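
A minimal sketch of the kind of PySpark ingestion job orchestrated from Azure Data Factory described above; the storage account, container, and dataset names are illustrative assumptions rather than details of the actual engagement.

    # Illustrative PySpark ingestion job run on Azure Databricks and triggered by
    # an Azure Data Factory pipeline. Storage account, container, and dataset
    # names are hypothetical placeholders.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("sales_ingestion").getOrCreate()

    # Read raw CSV files landed in Azure Data Lake Storage (path is an assumption).
    raw = (spark.read
           .option("header", "true")
           .csv("abfss://raw@examplestorage.dfs.core.windows.net/sales/"))

    # Basic cleansing: drop empty rows, normalize column names, stamp the load date.
    cleansed = (raw.dropna(how="all")
                .toDF(*[c.strip().lower().replace(" ", "_") for c in raw.columns])
                .withColumn("load_date", F.current_date()))

    # Write curated output as partitioned Parquet for downstream Power BI reporting.
    (cleansed.write
     .mode("overwrite")
     .partitionBy("load_date")
     .parquet("abfss://curated@examplestorage.dfs.core.windows.net/sales/"))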

Associate Architect/Lead Data Engineer

Oasys Cybernetics
Chennai
06.2021 - 09.2022
  • Analyzed large data sets to determine the optimal way to aggregate and report on them.
  • Designed ETL processes that pulled and transformed data from MySQL into a Hadoop data warehouse, incorporating complex business-logic calculations to serve as a reporting source.
  • Developed simple to complex Spark jobs using Hive to cleanse and load MySQL/PostgreSQL data.
  • Managed various Python files in an OpenStack environment and made changes as needed.
  • Designed data lake and data warehousing architectures in on-premises and cloud environments.
  • Reviewed Python code, ran troubleshooting test cases, and resolved bug issues.
  • Used the DBeaver SQL client to build frequently used queries in PostgreSQL.
  • Retrieved and analyzed data using Spark/Scala.
  • Built and implemented scalable cloud-based applications using AWS and GCP.
  • Performed data extractions using Dataflow streaming from Kafka.
  • Developed Spark code using PySpark and Spark SQL/Streaming for faster testing and processing of data.
  • Extensively used AWS Glue for data cleansing and structuring.
  • Created a denormalized BigQuery schema for analytical and reporting requirements.
  • Loaded historical data to Cloud Storage using Hadoop utilities and into BigQuery using BQ tools.
  • Created Python batch scripts to automate hourly runs of the ETL scripts.
  • Developed and automated frequently used Scala and Spark jobs using CI/CD tools.
  • Used Bitbucket, GitLab, and GitHub extensively for code deployment and storage.
  • Worked on installing and configuring Docker containers and images for web servers.
  • Created Docker images for applications, worked on Docker container snapshots, removed images, and virtualized servers in Docker.
  • Created and worked with ACID transactions on MySQL, DynamoDB, and MongoDB.
  • Worked with application teams to install operating system and Hadoop updates, patches, and version upgrades as required.
  • Used the AWS CLI to automate backups of data stored in S3 buckets.
  • Created APIs using FastAPI, Flask, and Swagger for AI/ML modules such as OCR (image-to-text extraction); a minimal sketch follows this list.
  • Created facial-recognition models using MTCNN, TensorFlow, and Keras, capturing images with the OpenCV Python library.
  • Worked with data visualization packages in Python.
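
A minimal sketch of a FastAPI endpoint wrapping an OCR (image-to-text) module as described above; pytesseract stands in here for whatever extraction backend was actually used, and the route name is an assumption. FastAPI generates the Swagger UI for this endpoint at /docs automatically.

    # Illustrative FastAPI endpoint exposing an OCR (image-to-text) module.
    # pytesseract is an assumed stand-in for the real extraction backend.
    import io

    from fastapi import FastAPI, File, UploadFile
    from PIL import Image
    import pytesseract

    app = FastAPI(title="OCR service (sketch)")

    @app.post("/ocr")
    async def extract_text(file: UploadFile = File(...)):
        # Read the uploaded image into memory and run text extraction.
        data = await file.read()
        image = Image.open(io.BytesIO(data))
        text = pytesseract.image_to_string(image)
        return {"filename": file.filename, "text": text}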

Lead Data Engineer

Entrans IO
Chennai
12.2020 - 05.2021
  • Created new Python decorators to add functionality without changing existing functions (a minimal sketch follows this list).
  • Developed simple to complex Spark jobs using Hive to cleanse and load various data streams.
  • Designed test plans and test cases, verifying and validating web-based applications.
  • Used a test-driven approach to develop the application and implemented unit tests using the Python unittest framework.
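
A minimal sketch of the decorator approach described above, adding behavior around existing functions without modifying them; the function and logger names are illustrative.

    # Illustrative decorator that adds timing and logging around existing
    # functions without changing their bodies; names are hypothetical.
    import functools
    import logging
    import time

    logging.basicConfig(level=logging.INFO)
    log = logging.getLogger(__name__)

    def timed(func):
        """Log how long the wrapped function takes to run."""
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            result = func(*args, **kwargs)
            log.info("%s finished in %.3fs", func.__name__, time.perf_counter() - start)
            return result
        return wrapper

    @timed
    def load_stream(path):
        # The existing function stays unchanged; behavior is added around it.
        with open(path) as f:
            return [line.strip() for line in f]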

Sr. Hadoop Developer

IQVIA
Plymouth Meeting
07.2018 - 03.2019
  • Designed ETL processes that pulled and transformed data from Oracle into a Hadoop data warehouse, incorporating complex business-logic calculations to serve as a reporting source.
  • Developed simple to complex Spark jobs using Hive to cleanse and load mainframe data (a minimal sketch follows this list).
  • Involved in creating design patterns for the business model, such as point-in-time snapshots for the insurance business.
  • Handled importing of data from various data sources; performed transformations using Hive, Impala, and MapReduce; loaded data into HDFS; and extracted data from Oracle into HDFS using Sqoop.
  • Handled network architecture, application troubleshooting, application design, and systems analyst duties.
  • Used the SQuirreL SQL client to build frequently used queries in Hive and Impala.
  • Exported analyzed data from Hive tables to Oracle/DB2 databases using Sqoop for visualization and to generate reports for the BI team.
  • Retrieved and analyzed data using Spark/Scala. Used Insomnia for transferring tables to and from Oracle and Hadoop.
  • Developed Spark code using Scala and Spark SQL/Streaming for faster testing and processing of data.
  • Used YARN to submit and run Spark jobs in cluster mode for better performance.
  • Extensively used Hive/Impala for data cleansing.
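
A minimal sketch of cleansing a Hive staging table and reloading it as a partitioned reporting table, as described above; the actual jobs were written in Scala, and the database, table, and column names here are illustrative assumptions.

    # Illustrative cleansing job over Hive tables (the real jobs were Scala/Spark).
    # Database, table, and column names are hypothetical placeholders.
    from pyspark.sql import SparkSession, functions as F

    spark = (SparkSession.builder
             .appName("claims_cleanse")
             .enableHiveSupport()
             .getOrCreate())

    # Read raw mainframe-sourced records from a Hive staging table.
    raw = spark.table("staging.claims_raw")

    # Cleanse: trim keys, normalize dates, and drop duplicates on the business key.
    cleansed = (raw
                .withColumn("claim_id", F.trim("claim_id"))
                .withColumn("claim_date", F.to_date("claim_date", "yyyyMMdd"))
                .dropDuplicates(["claim_id"]))

    # Load into a partitioned Hive table used as the reporting source.
    (cleansed.write
     .mode("overwrite")
     .partitionBy("claim_date")
     .saveAsTable("warehouse.claims_clean"))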

Sr. Hadoop Developer

American Family Insurance
Madison
04.2016 - 06.2018
  • Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
  • Developed and automated frequently used jobs using Python and Spark.
  • Used Bitbucket and Git extensively for code deployment and storage.
  • Involved in creating Spark SQL scripts to replace Hive queries for better performance.
  • Used Hive/Impala to analyze partitioned and bucketed data and compute various metrics for reporting.
  • Created Hive tables in ORC and Parquet formats for better query optimization (a minimal sketch follows this list).
  • Used Unix bash scripts to validate files moved from Unix to HDFS file systems.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data.
  • Created and stored data in NoSQL databases such as HBase.
  • Scheduled scripts to cleanse and load data into partitioned Hive tables using Autosys and Control-M.
  • Worked with application teams to install operating system and Hadoop updates, patches, and version upgrades as required.
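
A minimal sketch of creating and loading a partitioned, ORC-backed Hive table as described above, shown through Spark's SQL interface; the table and column names are illustrative assumptions.

    # Illustrative creation and load of a partitioned, ORC-backed Hive table.
    # Table and column names are hypothetical placeholders.
    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .appName("policy_tables")
             .enableHiveSupport()
             .getOrCreate())

    # Partitioned ORC table: partition pruning on load_date keeps scans cheap.
    spark.sql("""
        CREATE TABLE IF NOT EXISTS warehouse.policy_events (
            policy_id  STRING,
            event_type STRING,
            amount     DOUBLE
        )
        PARTITIONED BY (load_date DATE)
        STORED AS ORC
    """)

    # Allow dynamic partitioning, then insert from a staging table so Hive
    # routes each row to its load_date partition.
    spark.sql("SET hive.exec.dynamic.partition=true")
    spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")
    spark.sql("""
        INSERT OVERWRITE TABLE warehouse.policy_events PARTITION (load_date)
        SELECT policy_id, event_type, amount, load_date
        FROM staging.policy_events_raw
    """)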

Sr. Hadoop Developer

Edward Jones
Maryland Heights
04.2015 - 10.2016
  • Evaluated business requirements and prepared detailed specifications, following project guidelines required to develop programs.
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Analyzed large data sets to determine the optimal way to aggregate and report on them.
  • Designed ETL processes that pulled and transformed data from DB2 into SQL Server 2008 R2 and SQL Server 2012 data warehouses, incorporating complex business-logic calculations to serve as a reporting source.
  • Developed simple to complex MapReduce jobs using Hive to cleanse and load mainframe data.
  • Handled importing of data from various data sources; performed transformations using Hive, Impala, and MapReduce; loaded data into HDFS; and extracted data from MySQL into HDFS using Sqoop.
  • Handled network architecture, application troubleshooting, application design, and systems analyst duties.
  • Exported analyzed data from Hive tables to mainframe DB2 databases using Sqoop for visualization and to generate reports for the BI team.
  • Retrieved and analyzed data using Spark/Scala. Developed Spark code using Scala and Spark SQL/Streaming for faster testing and processing of data.
  • Used the Spark API over Hadoop YARN to perform analytics on data in Hive (a minimal sketch follows this list). Extensively used Hive/Impala for data cleansing.
  • Created partitioned tables in Hive/Impala. Managed and reviewed Hadoop log files.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
  • Administered table access for various user groups using the Sentry application in Cloudera.
  • Performed general system monitoring for machines running a Hadoop cluster through Cloudera Manager (disk space, disk partitions, etc.).
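
A minimal PySpark sketch of computing reporting metrics over partitioned Hive data with Spark on YARN, as described above; the actual code was written in Scala and submitted in cluster mode, and the table, column, and metric names here are illustrative assumptions.

    # Illustrative aggregation over a partitioned Hive table; the real jobs were
    # Scala, submitted to YARN in cluster mode. Names are hypothetical.
    from pyspark.sql import SparkSession, functions as F

    spark = (SparkSession.builder
             .appName("branch_trade_metrics")
             .enableHiveSupport()
             .getOrCreate())

    # Partition filter limits the scan to recent data.
    trades = spark.table("warehouse.trades").where(F.col("trade_date") >= "2016-01-01")

    # Compute simple reporting metrics per branch and month.
    metrics = (trades
               .groupBy("branch_id", F.date_format("trade_date", "yyyy-MM").alias("month"))
               .agg(F.count("*").alias("trade_count"),
                    F.sum("amount").alias("total_amount")))

    metrics.write.mode("overwrite").saveAsTable("reporting.branch_trade_metrics")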

Mainframe Developer

Tata Consultancy Services
Chennai
01.2008 - 03.2015
  • Involved in each phase of the SDLC, from analysis through install, participating in and closely reviewing every phase with the teams.
  • Learned IDMS and was involved in understanding and migrating IDMS to DB2.
  • Provided timely defect fixes during the UAT phase, including for modules developed by others.
  • Developed and delivered all the components with minimal defects.
  • Provided solutions for complex situations raised during the user testing phase.
  • Provided expertise in analyzing complex requirements.
  • Attended client meetings to discuss project status and held review meetings.
  • Worked on incident and problem management.
  • Handled severity 1 and 2 issues, provided quick resolutions, and prepared postmortem reports.
  • Worked extensively in change management to create the process for approving changes going into production without impacting systems.

Education

Bachelor of Science - Computer Science And Engineering

SACS MAVMM Engg College
Madurai, India
04.2007

Skills

  • Python, PySpark
  • SQL, Oracle
  • Python Programming
  • Power BI
  • Microsoft Azure Databricks, Data Factory
  • Azure Data Lake, Logic Apps
  • Unix, Python scripting
  • Data Warehousing
  • Data Migration
  • Big data technologies
  • SQL and Databases
  • Source and Version Control: Git, GitHub
  • Continuous integration and deployment
  • Autosys, Control-M

Timeline

Technical Lead

Wipro
10.2022 - Current

Associate Architect/Lead Data Engineer

Oasys Cybernetics
06.2021 - 09.2022

Lead Data Engineer

Entrans IO
12.2020 - 05.2021

Sr. Hadoop Developer

IQVIA
07.2018 - 03.2019

Sr. Hadoop Developer

American Family Insurance
04.2016 - 06.2018

Sr. Hadoop Developer

Edward Jones
04.2015 - 10.2016

Mainframe Developer

Tata Consultancy Services
01.2008 - 03.2015

Bachelor of Science - Computer Science And Engineering

SACS MAVMM Engg College
04.2007