- Work as Big Data and Cloud Architect with business partners and application product owners to understand detailed business requirements and translate them into technical requirements.
- Designed data warehouse and data lake solutions, along with data processing pipelines, using PySpark, EMR, Glue Data Catalog, Airflow, and Athena; the entire cloud infrastructure was created using Terraform.
- Performed data modeling in Snowflake for transactional and analytics needs.
- Involved in hands-on development and configuration of data processing with PySpark on EMR using PyCharm, EMR Studio, and Jupyter notebooks.
- Designed and implemented data migration from on-premises systems to the cloud.
- Created dashboards in CloudWatch for monitoring key system metrics.
- Implemented all PACCAR standard security policies in the AWS Cloud, including SSL security at the application level.
- Developed event-based triggering with S3 and Lambda to execute campaigns for business teams.
- Designed and developed a data catalog using Glue Crawlers and queried it in Athena for analytics.
- Worked on and delivered critical on-premises-to-cloud modernization projects as part of the AWS cloud migration journey.
- Automated shutdown and startup of on-demand systems using SSM, so business users could start and stop them from their own screens.
- Automated AWS key and password rotation using AWS Lambda, AWS Secrets Manager, and CloudWatch Events.
- Designed and developed a data lake on AWS S3 and integrated data from various source systems.
- Designed and implemented historical data migration from on-premises databases to the cloud using fine-tuned custom scripts.
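The S3/Lambda event-based triggering described above can be sketched as follows. This is a minimal illustration, not the production code: `start_campaign` semantics are folded into the handler's return value, and the function names are assumptions; only the S3 event notification payload shape is standard.

```python
import json
from urllib.parse import unquote_plus

def parse_s3_event(event):
    """Extract (bucket, key) pairs from an S3 event notification payload."""
    objects = []
    for record in event.get("Records", []):
        s3 = record.get("s3", {})
        bucket = s3.get("bucket", {}).get("name")
        # Object keys arrive URL-encoded (e.g. spaces become '+'), so decode them.
        key = unquote_plus(s3.get("object", {}).get("key", ""))
        if bucket and key:
            objects.append((bucket, key))
    return objects

def lambda_handler(event, context):
    """Lambda entry point: for each uploaded object, a campaign run would be
    kicked off downstream (e.g. an EMR step or Glue job -- omitted here)."""
    results = [f"campaign triggered for s3://{b}/{k}"
               for b, k in parse_s3_event(event)]
    return {"statusCode": 200, "body": json.dumps(results)}
```

Keeping the event parsing in a pure function makes the trigger logic unit-testable without AWS credentials; only the handler touches Lambda-specific conventions.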
Key Projects:
On-Premises to AWS Cloud Modernization:
- Redesigned and migrated an on-premises Java Spring Boot solution to AWS Cloud on a Scala/Spark engine, reducing business-campaign runtime by 80% and saving $250K per year in infrastructure costs.
- Replaced a vendor tool running on Cloudera Distribution with a new generic PySpark framework on AWS Cloud, applying proper data organization techniques, which saved $450K per year for the India business unit.
- Integrated various on-premises systems and portals with the AWS analytics platform using API Gateway, Lambda, and DynamoDB.
- Designed and developed data ingestion and processing pipelines using big data algorithms built in Spark, Python, PySpark, Scala, and Hadoop technologies.
- Performed data modeling for Snowflake.
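The "proper data organization" behind the Glue/Athena and data-lake work above typically means a Hive-style partition layout, which Glue Crawlers infer as table partitions and Athena uses for partition pruning. A minimal sketch (the bucket prefix and dataset names are illustrative assumptions):

```python
from datetime import date

def partition_path(prefix: str, dataset: str, d: date) -> str:
    """Build a Hive-style partitioned S3 key prefix (year=/month=/day=).
    Glue Crawlers register these directories as partitions, and Athena
    prunes scans to only the partitions a query's WHERE clause needs."""
    return (f"{prefix}/{dataset}/"
            f"year={d.year:04d}/month={d.month:02d}/day={d.day:02d}/")

# Example: where a daily ingestion run would write its Parquet output
# ("my-datalake" and "orders" are hypothetical names).
target = "s3://my-datalake/" + partition_path("raw", "orders", date(2021, 3, 7))
```

Writing each daily load under its own `year=/month=/day=` prefix keeps loads idempotent (a rerun overwrites one partition) and lets Athena skip unrelated dates entirely.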