Big Data Professional with over 2.5 years of experience in Product Development, Business Intelligence, Data Warehousing and Analytical Solutions for Life Sciences. Skilled in Big Data technologies like Hadoop, Hive, PySpark, and Google Cloud Platform (GCP). Strengths in Statistical Modelling and Machine Learning.
Able to translate client business needs into complex designs and reusable logic. Experienced in supporting data analysis with SQL and creating VBA-driven Excel dashboards.
Product Development
· Researched and investigated project requirements.
· Broke the work into Jira stories following the Agile approach.
· Developed multiple reports for a product, including the pilot, in PySpark and integrated them with a Power BI-based front end using Google BigQuery.
· Designed and developed an initial portable configuration setup module to aid the smooth deployment of the product across multiple clients.
· Used Git for the development and enhancement of the product reports.
· Optimized, quality-assured, and estimated the cost of the reports for the largest retailer in NAM.
Compliance Report
· Built a Python ETL process for unorganized license compliance data from multiple Windows machines, implementing complex rules to extract meaningful information.
· Developed a generic language converter Python module to execute PostgreSQL queries from a Python interface, designed for reuse outside the project.
· Crunched the compliance data for KPIs using Python and pushed the results to Postgres tables.
· Integrated the Postgres tables with the Power BI-based front end of the compliance report.
· Used NLP to extract insights for the compliance report.
· Automated the entire process, from data collection through data crunching to loading the Postgres tables, using crontab, scheduled to run every fortnight.
· Designed and developed the logic to archive older compliance data.
Subject Matter Expert
Services
Python