Experienced Data Engineer with almost 6 years of experience designing and implementing scalable data pipelines and applications. Proficient in Google Cloud-based applications and Big Data technologies, including GCP, Dataflow, Pub/Sub, Dataproc, Airflow, CI/CD, Advanced SQL, HDFS, YARN, and PySpark. Passionate about learning new concepts and helping businesses succeed. Skilled in developing scalable data solutions for large enterprise data from multiple sources, including structured and semi-structured data. Adept at collaborating with cross-functional teams and delivering end-to-end data solutions. Developed a data pipeline using a data lake that led to a client revenue increase of 19%.
Experience:
Key Achievements:
Proficient in Python
Hadoop Ecosystem, HDFS, YARN, PySpark, Scala, Spark SQL
Python, Shell Scripting, Linux
GCP, Cloud Dataflow, BigQuery, Pub/Sub, Cloud Shell, Dataproc, Cloud Functions
MySQL, Oracle, SQL Server
ADF, Azure Data Lake, Azure Blob Storage
Jupyter Notebook, Spyder, PyCharm, Anaconda
AWS Lambda, Kinesis Data Streams
Data Engineering with Google Cloud Professional Certificate from Coursera
Oracle Autonomous Database Cloud 2019 Certified Specialist
Oracle Cloud Infrastructure Developer 2020 Certified Associate
Oracle Cloud Infrastructure Foundations 2020 Certified Associate
Google Cloud Platform Big Data and Machine Learning Fundamentals
AI Genie Certified from Capgemini AI Academy
Deep Learning and Neural Networks Certification from Coursera
Participation Certificate from the Global Data Science Hackathon by Capgemini
Automation Engineer Practitioner Certification by Capgemini University
Customer Delight award from Capgemini
Data Science Hackathon - FinTech (https://www.credential.net/hz3y6qln)
Automation Foundation Level Certification by Capgemini University