Gautham Dubba

Celina

Overview

5 years of professional experience

Work History

Big Data Engineer

Microsoft Corporation
11.2023 - Current
  • Developed ETL jobs to extract, transform, and load data from various sources into the target system.
  • Automated the deployment of applications to different cloud environments.
  • Integrated existing systems with new platforms such as AWS S3 or Azure Blob Storage.
  • Maintained security protocols around sensitive information stored in the Big Data environment.
  • Created and maintained Hadoop clusters with multiple nodes.
  • Optimized queries on distributed systems such as Hive, Impala, or Presto to improve the performance of analytics tasks.
  • Implemented real-time streaming applications using Kafka Streams or Spark Streaming.

Data Engineer

Salesforce
03.2022 - 10.2023
  • Created stored procedures for automating periodic tasks in SQL Server.
  • Developed Python scripts for extracting data from web service APIs and loading it into databases.
  • Analyzed user requirements, designed and developed ETL processes to load enterprise data into the Data Warehouse.
  • Automated data quality checks and error handling processes to ensure the integrity and reliability of datasets.
  • Collaborated with cross-functional teams to gather requirements and translate business needs into technical specifications for data solutions.
  • Documented data architecture designs and changes, ensuring knowledge transfer and system maintainability.
  • Managed version control and deployment of data applications using Git, Docker, and Jenkins.

Big Data Engineer

Oracle Corporation
09.2020 - 10.2021
  • Developed ETL jobs to extract, transform, and load data from various sources into the target system.
  • Automated the deployment of applications to different cloud environments.
  • Built data pipelines using Apache Spark to process large amounts of structured and unstructured data.
  • Integrated existing systems with new platforms such as AWS S3 or Azure Blob Storage.
  • Deployed machine learning models on Big Data platforms for predictive analysis applications.

Data Engineer Intern

Oracle Corporation
03.2020 - 08.2020
  • Performed quality assurance checks on incoming datasets, identifying and resolving issues with accuracy and precision.
  • Maintained version control systems such as Git or SVN for tracking changes in the codebase over time.
  • Collaborated with team members across departments to ensure that data capture was accurate and reporting needs were met.

Education

Master of Science - Computer Information Systems

Trine University
USA
07.2023

Bachelor of Science - Computer Science

Gayatri Vidya Parishad College of Engineering
India
07.2021

Skills

  • ETL development
  • Data pipeline creation
  • Cloud deployment
  • Hadoop management
  • Query optimization
  • Real-time processing
  • Data quality assurance
  • Java programming
  • Data inference
  • Analytical skills
  • Cloud computing
  • Amazon Redshift
  • Hadoop ecosystem
  • Database security
  • Data analytics
  • Big data analytics
  • Google Cloud Platform
  • Python programming
  • NoSQL databases
  • Data lineage
  • MapReduce development
  • Apache Spark
  • Docker
  • API development
  • Machine learning
  • Data visualization

Timeline

Big Data Engineer

Microsoft Corporation
11.2023 - Current

Data Engineer

Salesforce
03.2022 - 10.2023

Big Data Engineer

Oracle Corporation
09.2020 - 10.2021

Data Engineer Intern

Oracle Corporation
03.2020 - 08.2020

Master of Science - Computer Information Systems

Trine University

Bachelor of Science - Computer Science

Gayatri Vidya Parishad College of Engineering