· 11 years of IT experience, including 6 years in data warehousing and ETL and 4 years with big data software stacks such as HDFS, Sqoop, Hive, Spark SQL, Spark, Kafka, NiFi, and Oozie, and with the AWS (S3, EC2) and Azure cloud platforms.
· Experience in the banking and healthcare domains.
· Working on a 600-node production cluster and a 20-node development cluster, both on the Cloudera distribution, with approximately 260 GB of incoming data per day on the production cluster.
· Worked on data migration from databases such as Oracle to the Hadoop platform.
· Good working knowledge of Spark Core (RDDs) and Spark SQL (DataFrames). Experience with various performance improvement and optimization techniques in Spark.
· Used Kibana for visualization by integrating it with Elasticsearch.
· Consumed data from Kafka as micro-batches using the Spark StreamingContext, processing it in memory in near real time.
· Good experience with the Spark DStream abstraction for reading streaming data from sources such as files, sockets, and Kafka, processing it with the Spark engine, and storing it in HDFS.
· Used the NiFi interface for creating, monitoring, and controlling data flows.
· Good experience in developing ETL pipelines in a data lake.
· Developed Spark scripts to process datasets, with HDFS as the storage layer.
· Developed Sqoop jobs to transfer data between DB2 and HDFS.
· Developed external Hive tables and improved their performance using tuning options such as partitioning, bucketing, indexing, and the cost-based optimizer (CBO); performed different types of joins on Hive tables and implemented Hive SerDes such as JSON and ORC.
· Deep understanding of various methods to tune and optimize Hive performance.
· Developed HQL scripts for analysis of data in Hive tables.
· Good understanding of Oozie workflow generation and scheduling with coordinators.
· Strong experience with data warehousing and ETL concepts, including OLAP and OLTP.
· Strong working experience in data extraction, transformation, and loading processes using DataStage (both server and parallel).
· Experienced in Agile and Waterfall methodologies.
ETL - IBM DataStage v8.5, v8.7, v9.1, v11.7
· IBM Certified Solution Developer, InfoSphere DataStage 8.0 and 8.5.
· Certified as an Oracle SQL Expert.
· Acquired the ITIL Foundation certificate in IT Service Management.
· Acquired banking certification in BFS - Wealth Management and Global Consumer Banking.
· Certified in Databricks Lakehouse Fundamentals.
· Certified in Microsoft Azure Fundamentals (AZ-900), Microsoft Azure Data Fundamentals (DP-900), and Microsoft Azure Data Engineer Associate (DP-203).