Worked on an insurance-domain engagement that included the following key projects:
- Migrating Big Data to Microsoft Azure
- Rewriting legacy Perl and PowerShell scripts and Oracle stored procedures in Python
- Implementing advanced Python concepts to make the codebase future-proof
- Restructuring the entire codebase for simplicity and to avoid repetitive activities across it
- Implementing testing frameworks for a few projects to reduce manual testing
The key technical aspects are listed below.
Python
Development
- Extensive use of data structures
- Implementing advanced features such as generators, iterators, map, lambda, filter, etc.
- Object-oriented programming in Python using classes
- Implementation of design patterns
- Use of multithreading
- Establishing connectivity to various databases such as Oracle, Azure Synapse, Hive, etc.
Coding Standards
- Pyflakes, PEP 8 guidelines
Testing Framework
- Unit tests, pytest, automated testing with tox, test coverage
Documentation
- Python documentation built with the Sphinx framework
Data analysis
- NumPy, pandas, PySpark, HDF5
Python & Big Data
- Web API calls to HDFS, Hive, etc. for performing various operations
- PySpark invocation, Spark SQL
- Development in PySpark using RDDs
- Extensive use of Hive queries
Microsoft Azure
Azure Data Factory
- Implementing Modern Data Warehouse concepts using ADF activities
- Handling data from various types of source and destination systems
- Scheduling and monitoring pipelines
- Basic transformations using the provided activities
- Creating datasets and linked services for numerous requirements
- Establishing secure connectivity via Key Vault
ADLS Gen2
- Identifying the data model based on the storage mechanisms available in ADLS Gen2
- Using temp tables for session-level needs
- Establishing connectivity from Python via Web API calls and performing operations on the ADLS file system
- Handling data in Azure Storage Explorer
Azure Synapse
- Identifying the structure of Synapse
- Designing replicated tables in Synapse
- Data validation and verification using Synapse queries
Azure Databricks
- Data manipulation and complex transformations using Databricks notebooks
- Development of notebooks in PySpark
- Establishing connectivity with ADLS and ADF from Databricks
- Creating tokens for triggering notebooks from on-premises Python wrappers
SDLC
- Adherence to the Agile process; involvement in Scrum calls with Product Owners
- Comfortable using Azure Boards
- Understanding of the process in Azure Pipelines and Azure DevOps (AzDo) deployments
- Basic code checkout, check-in, and merge operations
- Storing deployment artifacts