Detail-oriented Data Engineer who designs, develops and maintains highly scalable, secure and reliable data structures. Responsive expert experienced in monitoring database performance, troubleshooting issues and optimizing database environments. Possesses strong analytical skills, excellent problem-solving abilities, and a deep understanding of database technologies and systems. Equally confident working independently or collaboratively, with excellent communication skills.
Overview
6
years of professional experience
1
Certification
Work History
Data Engineer
Buildertrend Solutions, Inc
12.2022 - 11.2023
Practiced Agile development (2-week sprints/iterations) and Test-Driven Development with automated CI/CD
pipelines using Git, Jenkins and ADO as part of daily duties
Utilized Google Data Catalog and various Google Cloud APIs to oversee query performance and conduct billing-
related analysis for BigQuery usage, enhancing monitoring capabilities and optimizing data analysis workflows
Implemented automated data loading into BigQuery using DBT, Data Fusion and Fivetran, which reduced the
manual workload by 80%
Also implemented Reverse ETL solutions using Hightouch
Spearheaded and implemented a major project that transitioned live data sourced from Cloud Pub/Sub into
purpose-specific models for the first time in the data warehousing team
Experienced in Cloud Functions, Dataproc and Composer
Implemented robust access controls, data quality measures, and documentation protocols, resulting in
heightened data security, integrity, and compliance across multi-tiered projects using Atlan
Used Python libraries extensively to create pipelines, developed multiple Python scripts for various
automation jobs, and configured Airflow for workflow management
Developed BigQuery authorized views, implementing row-level security and facilitating secure data exposure to
other teams
Developed and implemented data extraction processes to collect information from various sources
Implemented machine learning algorithms and models to enhance data analysis and prediction capabilities, using
tools such as TensorFlow and Scikit-learn
Identified and implemented data optimization techniques to improve performance and efficiency of data
analysis workflows
Designed and implemented data governance policies and procedures to ensure data integrity and compliance,
including data classification and data retention
Data Platform Engineer
Spreetail, LLC
08.2021 - 10.2022
Extensively developed SSIS packages and Azure Data Factory pipelines, orchestrating seamless data transformations from diverse sources into the data warehouse
Engaged in daily scrum meetings to facilitate sprint progress discussions, actively contributing to the optimization and increased productivity of the scrum process
Worked on Microsoft Azure with Databricks, building pipelines with Azure Data Factory to move data into
Azure Data Lake Store, and developed Spark scripts on Azure HDInsight for data aggregation
Optimized queries across the entire warehouse using performance tuning strategies, which reduced runtime
Used DAX queries to extract data from cubes and also utilized Power BI for implementing reporting solutions
Architected a data integration solution using DBT (Data Build Tool) and migrated pipelines, which reduced costs by 20%
Proficient in crafting intricate SQL queries to extract, manipulate, and analyze data, optimizing database performance and enabling comprehensive insights
Utilized Octopus Deploy to streamline deployment processes, orchestrating efficient and dependable application rollouts across various environments
Developed and designed applications to source data from multiple downstream events using Kafka and then process it in Spark
Instrumental in the creation and proficient management of diverse database objects, encompassing tables, views, stored procedures, triggers, Common Table Expressions (CTEs), and temporary tables
Collaborated with cross-functional teams to understand business requirements and provide data-driven solutions
Senior Data Analyst | Data Engineer
08.2017 - 08.2021
Led multiple projects in close collaboration with business users, while also training and mentoring junior data
analysts
Implemented complex ETL pipelines from various sources using Pentaho Data Integration, which reduced
manual transformations by 80%
Initiated and implemented robust CDC strategies, ensuring accurate and efficient synchronization of data across
diverse systems, facilitating informed decision-making and maintaining data consistency across the organization
Collaborated cross-functionally with teams to identify key data requirements and translate them into effective
models, contributing to enhanced decision-making processes
Automated web data extraction and storage in databases, orchestrating scheduled API calls across diverse
websites using UiPath
Utilized UiPath Orchestrator for seamless deployment, monitoring, and management of
UiPath Robot automation activities
Owned and maintained all facets of ABBYY FlexiCapture, enabling automated data extraction from PDF documents
Developed and maintained robust data pipelines, performed data cleansing, and implemented efficient data
transformations using Python
Implemented a standardized documentation process, improving the efficiency of knowledge transfer across
departments and enhancing overall product usability
Developed a REST API using Python and integrated various data sources including JDBC, JSON, spreadsheets and
text files
Developed and designed comprehensive reports and dashboards, enabling intuitive data visualization for
stakeholders
Identified and implemented innovative data analysis techniques to improve data quality and accuracy, resulting in
more reliable insights and decision-making for the organization.
Education
Master of Science - Computer Science
The University of Texas
05.2017
Bachelor of Technology - Computer Science
Jawaharlal Nehru Technological University
04.2015
Skills
Azure Stack
Google Cloud Platform
Hightouch
Microsoft SQL Server
MySQL
Data Build Tool (DBT)
SSIS
Pentaho
SQL
Scala
Python
Azure DevOps
Jira
Confluence
Git
Power BI
JReport
Octopus Deploy
Airflow
Performance Tuning
Visio
Erwin
Data Transformation
Jupyter Notebook
Database Development
Certification
Microsoft Certified: Azure Data Fundamentals (Credential ID: 946377117C15E0EA)