
ABHIJITH U NAIR

Cochin, Kerala

Summary

Senior Data Engineer with deep expertise in real-time streaming and batch data pipelines, cloud-native platforms, and enterprise-scale data migration projects. Currently engaged in complex Databricks production migrations at Dana, specializing in Unity Catalog enablement and DevOps-driven Delta Live Table deployment frameworks. Track record of developing scalable data solutions using Azure Cloud Services, Databricks, and AWS services. Adept at integrating semantic knowledge graphs, orchestrating metadata cataloging, and delivering self-service analytics using cutting-edge open-source technologies. Demonstrated ability to empower development teams with efficient, test-friendly production workflows and drive data quality at scale.

Overview

9 years of professional experience
1 Certification

Work History

Senior Data Engineer

Nest Digital Pvt Ltd
08.2022 - Current

Role: Develop, explore, and deploy data engineering, data science, and programming capabilities using cloud platforms, multiple data modelling paradigms, and database technologies.

Key Responsibilities:

  • Collaborated with cross-functional teams to define technical requirements and deliver end-to-end data engineering solutions for both internal teams and end-user features.
  • Led the migration of sensitive, real-time production data pipelines to Unity Catalog-enabled Databricks Workspaces, ensuring governance and scalability.
  • Reengineered legacy ETL workflows by identifying performance bottlenecks and applying Spark optimization strategies.
  • Enhanced system performance by architecting scalable data solutions capable of handling high-traffic enterprise workloads.
  • Developed robust database architectures supporting seamless dataset integration and rapid analytics enablement.
  • Built and deployed Delta Live Table (DLT) pipelines integrated with a DevOps strategy, isolating production and feature environments.
  • Designed and implemented custom APIs using Azure Functions for real-time data publishing and ingestion via Azure Event Hub.
  • Supported development teams by enabling sandbox environments while preserving production integrity.
  • Collaborated on complex ETL processes ensuring data consistency, integrity, and system resilience.
  • Worked across data platforms such as SNOW and Canvas for multi-platform processing.
  • Designed and implemented an enterprise-level Data Lakehouse as a unified source of truth.
  • Applied Medallion Architecture (Bronze, Silver, Gold layers) for structured data ingestion and processing.
  • Enabled efficient data cataloging for consumers using Amundsen, improving discoverability and governance.
  • Developed a semantic Knowledge Graph using ontology modeling for structured healthcare data.
  • Enabled auto-discovery of entities and relationships, supporting rich semantic search capabilities via SPARQL and GraphQL.

Cloud Platforms:

  • AWS:
    Built architecture with AWS Glue, Athena, and EKS for ingestion, processing, and storage on S3.
    Deployed Apache Spark on EKS and integrated APIs using AWS AppSync, Neptune, and Lambda.
  • Azure:
    Streamed IoT telemetry data via Azure Event Hub into Blob Storage for processing.

ETL & Data Processing:

  • Apache Spark:
    Built ETL jobs for batch and real-time ingestion from on-prem DB and Kafka (including CDC via Debezium).
    Stored transformed data in Delta format following Medallion Architecture.
  • Databricks:
    Developed Delta Live Table pipelines to process streaming and batch workloads in a unified framework.
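The Medallion layering referenced above (Bronze, Silver, Gold) can be illustrated with a minimal, pure-Python sketch. The actual pipelines ran on Spark and Delta Live Tables; the function names, field names, and sample data below are assumptions for illustration only, not the production code.

```python
# Illustrative, pure-Python sketch of Medallion (Bronze/Silver/Gold) layering.
# Real pipelines used Spark / Delta Live Tables; names here are hypothetical.

def bronze_ingest(raw_lines):
    """Bronze: land raw records as-is, tagging each with its source line."""
    return [{"raw": line, "line_no": i} for i, line in enumerate(raw_lines, 1)]

def silver_clean(bronze_rows):
    """Silver: parse, type, and deduplicate records."""
    seen, cleaned = set(), []
    for row in bronze_rows:
        parts = row["raw"].strip().split(",")
        if len(parts) != 2:
            continue  # drop malformed rows
        device, reading = parts
        if (device, reading) in seen:
            continue  # drop exact duplicates
        seen.add((device, reading))
        cleaned.append({"device": device, "reading": float(reading)})
    return cleaned

def gold_aggregate(silver_rows):
    """Gold: business-level aggregate -- average reading per device."""
    totals = {}
    for row in silver_rows:
        s, n = totals.get(row["device"], (0.0, 0))
        totals[row["device"]] = (s + row["reading"], n + 1)
    return {dev: s / n for dev, (s, n) in totals.items()}

raw = ["sensor-a,1.0", "sensor-a,1.0", "sensor-a,3.0", "sensor-b,2.0", "bad row"]
gold = gold_aggregate(silver_clean(bronze_ingest(raw)))
```

Each layer only consumes the previous layer's output, which is the same contract a Delta Live Tables pipeline expresses with table dependencies.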

Data Cataloging:

  • Amundsen:
    Developed metadata extraction scripts for tables, dashboards, and charts.
    Customized backend services to enhance deployment and search experience.

Knowledge Graphs:

  • Neo4j:
    Built property graph models for semantic querying and exploration.
  • Stardog:
    Used ontology-based modeling for RDF graph generation with federated data sources.
    Enabled querying via SPARQL and GraphQL for enterprise-scale KG solutions.

Visualization & Analytics:

  • Dremio: Offloaded expensive data warehouse use cases and enabled self-service analytics on an open data lakehouse.
  • Apache Superset: Developed dashboards for internal reporting and interactive data exploration.

Senior Executive Representative – Data Quality

TransUnion CIBIL
11.2019 - 04.2022

Role: Develop Python applications for processing and gaining insights over data using PySpark and Hive SQL.

Python & PySpark Script Development

  • Developed standalone Python applications using standard libraries (os, csv, sys) to analyze and generate insights on the quality of member-submitted data.
  • Integrated data profiling functionalities into PySpark-based applications to detect anomalies, duplicates, and data fragmentation.
  • Automated routine analysis tasks using Linux shell scripting and scheduled jobs via Crontab.
  • Designed PySpark applications for data hashing and decryption, employing libraries such as hashlib and cryptography to handle sensitive mobile data securely.
  • Built regex-driven extraction tools (using the re module) to parse structured/unstructured data and validate it against internal quality benchmarks, reporting deviations to relevant teams.
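The regex-validation and hashing work above can be sketched with only the stdlib `re` and `hashlib` modules the bullets mention. The benchmark pattern, salt, and sample values below are hypothetical stand-ins, not the actual internal quality rules or key material.

```python
# Minimal sketch of regex-driven validation plus SHA-256 hashing of
# sensitive mobile data. Pattern and salt are hypothetical examples only.
import hashlib
import re

# Hypothetical benchmark: a 10-digit Indian mobile number starting with 6-9.
MOBILE_RE = re.compile(r"[6-9]\d{9}")

def validate_and_hash(records, salt="example-salt"):
    """Return (hashed valid values, deviations to report back to the team)."""
    valid, deviations = [], []
    for value in records:
        cleaned = value.strip()
        if MOBILE_RE.fullmatch(cleaned):
            digest = hashlib.sha256((salt + cleaned).encode()).hexdigest()
            valid.append(digest)
        else:
            deviations.append(value)  # deviation reported, never hashed
    return valid, deviations

hashed, bad = validate_and_hash(["9876543210", "12345", " 9123456789 "])
```

Salting before hashing (rather than hashing the raw value) is one common way to resist rainbow-table lookups on low-entropy identifiers such as phone numbers.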

Data Analysis & Visualization (PySpark SQL, Hive, Tableau, Splunk)

  • Conducted in-depth data quality analysis based on team observations and member expectations using PySpark SQL and Hive queries, driving improvement in data submissions.
  • Developed and maintained interactive Tableau dashboards for insight generation, used by both operational teams and senior management.
  • Created and supported Splunk queries for extracting real-time insights into commercial and customer data, analyzing patterns related to inquiries and data quality indicators.

Stakeholder Collaboration & Data Governance

  • Actively collaborated with internal teams to improve data submission workflows, ensuring alignment with quality standards.
  • Communicated insights and actionable recommendations to members regularly, helping reduce duplication and anomalies, thus minimizing disputes with the data bureau.

Technical Support Service – Java Programming & Systems Admin

CDAC, Kolkata (via T&M Services)
10.2018 - 01.2019

Role: Java Programming and System Administration

  • Developed Java applications using Hibernate and MQTT.
  • Maintained networking and virtual machine environments (Oracle VM) for test operations.

Technical Support Engineer L1

WNS Global Services (via Comnet Pvt Ltd)
10.2016 - 12.2017

Role: Networking

  • Performed active routing state checks and identified the network protocols (e.g., OSPF, BGP) used across routers and switches to ensure proper configuration and communication.
  • Monitored performance metrics between routers and switches within the data center (DC) to proactively identify latency, congestion, or packet loss issues.
  • Oversaw connectivity health between routers and external service provider links, ensuring stable and reliable data transmission.
  • Conducted firewall monitoring and diagnostics for Juniper and Fortinet firewalls, identifying security policy violations, access issues, and routing conflicts.
  • Coordinated with service providers and DC management by raising and tracking tickets for active link failures, ensuring timely issue resolution and service continuity.

Education

PG-Diploma - Big Data Analytics (PG-DBDA)

CDAC
Kolkata
07.2018

B.E. - Computer Science Engineering

Anna University
Coimbatore
01.2015

HSC

Maharashtra State Board
Kalyan
01.2010

SSC

Maharashtra State Board
Kalyan
01.2008

Skills

  • ETL development and machine learning
  • Big data processing (Databricks and Apache Spark)
  • Kafka streaming (Azure Event Hub and Apache Kafka)
  • Data pipeline design (Delta Live Tables, streaming and batch)
  • Data modeling
  • API development
  • Cloud Platforms: AWS (Glue, Athena, EKS, AppSync, S3, Neptune)
  • Languages: Python, Java, R, C
  • Data Catalog & Visualization: Unity Catalog, Amundsen, Apache Superset, Tableau, Splunk
  • Graph & Semantic DBs: Neo4j, Stardog
  • DevOps & OS: Azure DevOps, Linux (RHEL), Crontab, Windows Server
  • Web Tech & Frameworks: Django, Flask, JavaScript
  • Databases: Oracle, MySQL, Hive

Certification

  • AWS Certified Solutions Architect – Associate
  • Microsoft Certified: Azure Data Fundamentals
