Specializing in ETL pipelines for big data projects and data analysis. Experienced in all stages of the development cycle for complex data loads and predictive analysis projects.
Big data frameworks and libraries: Spark, Snowflake, Kafka, Hadoop, Databricks, NumPy, SciPy, Pandas and Matplotlib
CI/CD tools: Jenkins, Git and Docker
Cloud: Azure, AWS
Schedulers: Airflow
Databases: PostgreSQL, Snowflake, HBase, Hive and Cassandra
Big data filesystems: HDFS, Azure Blob Storage and Amazon S3
Java, Software Testing and SQL