Deepak Marathe is a Software Professional with 13 years of Experience as a Leader of Globally Distributed Remote Teams and as an Individual Contributor in Big Data Platform Engineering and Analytics and Polyglot Full-Stack Software Development, with Solid exposure to Cloud, Blockchain/Web3 and Generative AI.
Personal Website : https://deepakmarathe.netlify.app
● Formulated a theorem in Data Engineering, with Dr. Shivnath Babu, CTO, Unravel Data Systems Inc., involving Consistency, Availability and Partitioning for a Data System that relies on Polling to update its State.
● Implemented Case-Insensitive Search on Elasticsearch for the majority of Attributes in the Gaming Domain.
● Designed and Delivered a DataLake product on top of Google Cloud Bigtable with a Geographically Distributed, Remote team of 7 Data Engineers. Technologies : Python, Flask, Google Cloud Bigtable.
● Migrated Data Warehousing ETL Workflows from Hive-SQL to Spark using Python and Scala on AWS EMR/Azure Databricks Infrastructure. Used Airflow/Google Cloud Composer/Azure Data Factory for Orchestration.
● Assisted in Data Migration from an On-Premise Data Platform onto AWS using Spark, Pig and MapReduce.
● Optimized Computation Time on AWS Athena by distributing the processing using the ppss tool in Python.
● Helped Migrate Data from an AWS S3 Data Lake to GCP BigQuery.
● Developed a Distributed Application using Flink to perform Window-Based Aggregation on Streaming Data from Kafka, with results Visualized using the Influx Time Series Database and Grafana.
● Developed and Deployed a Web Application to host Blogs using Django and Python on Heroku Infrastructure.
● Developed NanoCube, a Scalable, Distributed, Spatio-Temporal DataCube Abstraction for Time Series Datasets, along with an Event-Based Framework for Real-Time Visualization of Rolling, Additive Aggregations.
● RESTful Microservice APIs for a Distributed, Multi-Dimensional, Spatio-Temporal Time Series DataCube supporting Ingestion, Search and Scan Operations on the GDELT Event Dataset.
● Query Interface modeled on Conjunctive Normal Form (CNF), implemented using Sparse BitSets.
● Point and Range Queries over time and over all or a subset of dimensions, with Pagination for Results.
● Supports Interactive Aggregations over 10 years' worth of GDELT data at the current content generation pace, built using Dropwizard, Java, GCP, Docker, Git.
● Memory-Mapped Files are used to efficiently retrieve data from disk, supporting a 100 ms response time over 10 million data points, with a result page size of 100.
● In-Memory Index Periodically Persisted to Disk/Cloud Storage Object.
● Distributed Processing for Machine Learning Production Pipelines using Apache Beam on Google Cloud Dataflow and Pub/Sub.
Worked on HTML, CSS, AJAX, jQuery and Reporting Tools like Jasper at iNAT Technologies, located at NITK Campus, Surathkal.
Programming Languages : Java, Python, SQL, Bash, Go, Node.js, Scala, C, C, Ruby, Solidity
Generative AI : Prompt Engineering, Grounding, Vector Databases, LangChain
Electronic Music Production
DJ
Playing Guitar Instrumental
Vedic Astrology