Deepak Marathe

Software Professional
Bangalore

Summary

Deepak Marathe is a Software Professional with 13 years of experience leading globally distributed remote teams and working as an individual contributor in Big Data Platform Engineering and Analytics and Polyglot Full-Stack Software Development, with solid exposure to Cloud, Blockchain/Web3 and Generative AI.

Personal Website : https://deepakmarathe.netlify.app

Overview

16 years of professional experience
4 years of post-secondary education
8 Certifications
5 Languages

Work History

Independent Consultant

Jnyana
Bangalore
02.2019 - Current
  • Unravel Data/USA/Remote : Chief Architect, Observability.
  • A5 Labs/USA/Remote : Product Owner, Bonus Engine, Gaming.
  • Indeed/USA/Remote : Lead, Big Data Migration.
  • Taiyō/USA/Remote : Engineering Leader, Big Data and Backend.


● Invented a theorem in Data Engineering involving Consistency, Availability and Partitioning for a Data System that works on Polling to update the State with Dr. Shivnath Babu, CTO, Unravel Data Systems INC.

● Implemented Case-Insensitive Search on ElasticSearch for majority of Attributes in Gaming Domain.

● Designed and Delivered a DataLake product on top of Google Cloud BigTable using a Geographically Distributed, Remote team of 7 data engineers. Technologies : Python, Flask, Google Cloud Bigtable.

● Migrated Data Warehousing ETL Workflows from Hive-SQL to Spark using Python and Scala on AWS EMR/Azure DataBricks Infrastructure. Used Airflow/Google Cloud Composer/Azure Data Factory for Orchestration.

● Assisted in Data Migration from an On-Premise Data Platform onto AWS using Spark, Pig, MapReduce Technologies.

● Optimized the Computation Time on AWS Athena by distributing the processing using the ppss tool in python.

● Helped Migrate the Data from AWS S3 Data Lake to GCP BigQuery.

● Developed a Distributed Application that performs Window Based Aggregation on Streaming Data from Kafka using Flink, with results visualized using Influx Time Series Database and Grafana.

● Developed and Deployed a Web Application to host the Blogs using Django, Python Technologies on Heroku Infrastructure.

● Developed NanoCube - Scalable, Distributed, Spatio-Temporal DataCube Abstraction for Time Series Datasets along with an Event Based Framework for RealTime Visualization of Rolling, Additive Aggregation.

● RESTFul Microservices APIs for Distributed Multi-Dimensional Spatio Temporal Time Series DataCube supporting Ingestion, Search, Scan Operations on GDelt Event Dataset.

● Query Interface modeled on CNF implemented using Sparse BitSets.
● Point and Range Queries over time, all/subset of dimensions with Pagination for Results.

● Supports Interactive Aggregations over 10 years worth of gdelt data at current content generation pace using Dropwizard, Java, GCP, Docker, Git.
● MemoryMapped Files are used to efficiently retrieve data on disk, supporting 100 ms response time over 10 million data points, with a result page size of 100.
● InMemory Index Periodically Persisted to Disk/Cloud Storage Object.
● Distributed Processing for Machine Learning Production Pipelines Using Apache Beam on Google Cloud DataFlow, PubSub.

Senior Software Engineer

Rippling
12.2021 - 05.2022
  • Developed APIs for Reporting, Analytics on Python, Django, MongoDB Tech Stack.
  • Improved Code Coverage to 95% by Systematic Refactoring to make the modules Unit Testable.
  • Provided support on Slack, 24*7.
  • I quit my full time employment to take care of my Health concerns.

Staff Software Engineer

Quartic.ai
Bangalore
11.2019 - 03.2020
  • Optimized Distributed Rule Engine for Anomaly Detection by 8X. Tech Stack : Kafka, Drools, Scala, Spark, Postgres.
  • Improved Code Coverage to 90% by Systematic Refactoring to make the modules Unit Testable.
  • Implemented Data Retention Policies on ElasticSearch using Python, Django. Used Airflow for Orchestration.
  • Onset of Covid affected my Full-Time Employment with Quartic.

Lead Software Engineer

Halodoc
Bangalore
12.2018 - 12.2019
  • Developed and Deployed Features in the Halodoc Backend System.
  • Developed RESTful Microservices (Java, Dropwizard, Hibernate), Domain modeling on MySQL, API development (Go-Lang) and deployment on Serverless Architecture (AWS Lambda).
  • Doctor Referral - Domain Modeling, implementation (Java, Dropwizard), Staging and Production deployment (AWS). The objective was to let patients get a second opinion from other doctors and to let consulting doctors refer a patient to a specialist doctor.
  • Recent Doctors - For the users that have consulted with doctors via tele-consultation, recent doctors have to be shown to the user. The objective was to increase engagement with doctors. Used Go-Lang with AWS Lambda to generate the recent doctor list and store it as users' attributes.
  • Doctor Categories - Grouping the available doctors with categories - helps the user of the app search doctors in relevant categories.
  • Actively involved in debugging and resolving bugs in production and provided support 24*7.

Product Engineer / SDE

GO-JEK Indonesia
Bangalore
12.2016 - 12.2018
  • Worked towards building a Self-Serve Data Platform scaling to Millions per day of Transactional data in an Agile Environment adhering to Pair Programming and Test Driven Development.
  • Framework for Data Consumption from Kafka, filtering based on data, and relaying it to pluggable sinks[console, database, http server] and evolved to plug dynamic sinks based on application Configuration from Consul. Fault tolerance and recovery using retry with Exponential Backoff semantics. Monitoring using Datadog/Statsd/JMX. Data archival to cold storage(GCS/Secor) and analysis(Spark/Zeppelin).
  • Streaming Data Aggregation using Kafka, Flink/Yarn, Influx, Grafana. Creation and management of HA Flink Cluster on Google DataProc Service. Remote submission/launch of data aggregators on Flink/Yarn clusters.
  • Microservices: Customer Wallet Suspension Service for Fraud (Java/Spark), Consul/HashiCorp for Configuration, Gradle as build tool, GitLab as version control and CI/CD. Used Hystrix dashboard/Turbine for Latency and Fault Tolerance, Guava rate limiter for rate limiting with NewRelic for Instrumentation, Datadog for monitoring and Slack/PagerDuty for alerting.

Senior Data Engineer

Intuit
Bangalore
04.2015 - 11.2016
  • Learned and Contributed to major projects in the IDEA BU. Ensured Code Quality through Unit Tests, Functional Tests, Code Coverage. Developed Voice Capabilities to Intuit's flagship product - QuickBooks Online using Amazon Voice Service/Alexa API on Amazon Echo hardware.
  • Distributed Configuration Management Service : Client using Archaius from Netflix, Guice dependency injection, testable code- code coverage, unit test, integration test, mocking. REST Server for configuration service using AWS-EC2, Dropwizard framework, AWS-DynamoDB.
  • Dragline: data copy tool from HDFS/Hive to Vertica : Feature implementation – configuration validator, logging improvements, bug fixes, code refactor for code quality. Kerberos, HP Vertica. Unit/Functional Tests for dragline: Improved code coverage numbers from 20% to 80%. CI/CD for the project using Jenkins + Sonar.
  • Test Framework - API to generate Mock data for testing : Data Source and Data Sink abstraction, extensible/pluggable data sources (Local FS, HDFS, S3, Vertica, Hive). Random data preparation using schema definition supplied via configuration and pluggable schema adapters. Support for Vertica data types.
  • Sangria Console: Web Application for admin to blacklist/whitelist EINs, developed using Spring (Java) + Hibernate ORM tool. App deployed on Tomcat. SSO integration and login module.
  • Hackathon - develop speech capabilities to Intuit's flagship product - QuickBooks-Online using amazon voice service API to run QBO skill on Amazon Echo device. Hosted a REST server in AWS using Elastic Beanstalk.
  • Developed plugins for the IntelliJ IDEA IDE - https://github.com/deepakmarathe/EditorOpenFilesCountPlugin.
  • Home Audio System using Raspberry Pi : install VLC player, use LUA HTTP interface to control the home theater connected to raspberry pi, using a handheld device which has VLC player installed in it. (VLC remote).

Senior Data Engineer

InMobi
Bangalore
03.2012 - 03.2015
  • Developed ETL Pipelines for Batch Systems.
  • Helped the team scale from handling a few GB of data to a few hundred TBs, contributing to a user base of hundreds of millions.
  • Built visualization tools for easy onboarding, analyzed large amounts of data to get insights on user patterns and helped business by building products atop those.
  • Data Ingestion and Curation: Built, deployed and managed data pipelines to ingest data from multiple sources. Used HBase/HDFS for storage, Thrift-LZO for data modeling, and MR/Pig for ETL. Also did some ad hoc analysis using Pig and Hive.
  • Augmented and automated data pipelines using in-house workflow scheduler (now Apache Falcon) built on top of Oozie. Configured a dashboard to collect and monitor metrics using Grafana. Was on pager duty to maintain SLA of service.
  • Analytics and Products: Understanding user interactions using data analysis- Improved ability to target ads by extracting user visit patterns and segregating users by devising scoring mechanisms to infer key metrics like Daily Active Users, Monthly Active Users over large data sets. Evolved it to a framework for custom Syntax Validations and Rule Based Classification.
  • User Targeting suite (Interest targeting, Retargeting, Segment/Persona targeting): Implemented user classification into various Personas, depending on business decisions. Addressed challenges involved in User Profile Creation, Shipping to Local Data Centers, and Online-Cache population for ad serving.
  • Extra : Participated in Hackathon conducted by InMobi: Developed a Data Pipeline Designer to address the needs of product analysts. Self-Serve is a Web based visual data flow designer capable of performing data discovery and code generation for apache pig. Participated in InMobi sports day and won Runner up trophy in Table Tennis men’s doubles category.


System Software Engineer

Hewlett Packard, STSD
Bangalore
08.2010 - 03.2012
  • Application for Manageability of servers, named Replication Manager - Active/Passive Management of HP storage servers/ control the replication. Web Application, Java/Spring.
  • Part of the Open Storage (OST) plug-in team : OST plug-in implementation for HP D2D storage servers, to work with Symantec NetBackup solution, C/C++ - Integration of client side deduplication.
  • Set up and Maintained Build/Release system and Automation for HP OST Software. Hudson/Jenkins as the CI/CD tool for build automation. Build automation on Windows using Visual Studio command line utilities and Ant, with code coverage tools integrated. Linux (RHEL, SUSE) build automation using make utility.
  • In charge of maintaining thread related features and common code base.

Academic Projects

NITK
Surathkal
06.2007 - 06.2009
  • Designed an Animation Package (in Java language) as part of a Computer Graphics project in the 4th semester.
  • Designed a project on database technologies in java language as a part of Database Systems, comprising basic functionalities of a typical database system.
  • Worked on a mobile application mBank to mobilize the banking process with the help of j2ME/Android.
  • Worked on the Web Application for Online Chess Club as a part of my curriculum project.
  • Designed ERP software for Adlabs FotoFast Client.


Internship Student

Tavant Technologies
Bangalore
05.2009 - 06.2009
  • Successfully completed a project on web-technologies involving Struts framework (MVC) at Tavant Technologies, Bangalore.


Internship Student

iNAT Technologies
Surathkal
05.2009 - 06.2009

  • Worked on HTML, CSS, AJAX, jQuery, and Reporting Tools like Jasper at iNAT Technologies, located on the NITK Campus, Surathkal.

Education

B.Tech - Information Technology

NITK
Surathkal, Karnataka
06.2006 - 06.2010

Skills

    Programming Languages : Java, Python, SQL, Bash, Go-lang, NodeJS, Scala, C, C++, Ruby, Solidity

Certification

Generative AI : Prompt Engineering, Grounding, Vector Databases, Lang Chain

Additional Information

  • PERSONAL INFORMATION
  • Name : Deepak Marathe
  • Phone Number : +918088936848
  • E-mail : dpkmarathe@gmail.com
  • Date of birth : 08-Dec-1988
  • LinkedIn : https://www.linkedin.com/in/deepak-marathe

Interests

Electronic Music Production

DJ

Playing Guitar Instrumental

Vedic Astrology

Work Availability

Monday, Tuesday, Wednesday, Thursday, Friday, Saturday, Sunday
Morning, Afternoon, Evening
