ANURAG LOHAR

Aurangabad

Summary

Senior Software Engineer at NucleusTeq Consulting specializing in big data processing and cloud solutions. Experienced with legacy systems, Databricks, and AWS; optimized Spark workloads for 30–40% faster processing while strengthening data governance. A collaborative problem-solver who has integrated diverse data sources to drive operational efficiencies.

Overview

11 years of professional experience
1 Certification

Work History

Senior Software Engineer

NucleusTeq Consulting Pvt. Ltd.
Indore
07.2023 - Current
  • Designed Databricks Lakehouse on AWS with Amazon S3 and Delta Lake for enhanced data management.
  • Built scalable batch and streaming pipelines using PySpark and Spark SQL, processing millions of records daily.
  • Orchestrated comprehensive workflows via Databricks Jobs and Apache Airflow (MWAA).
  • Integrated Databricks with AWS Glue Data Catalog, Redshift, and Athena to enhance analytics capabilities.
  • Implemented Delta Live Tables (DLT) with stringent data quality checks and continuous monitoring.
  • Optimized Spark workloads through partitioning, caching, broadcast joins, and Z-Ordering for 30–40% faster processing.
  • Monitored pipelines using CloudWatch and Databricks metrics to maintain high availability.
  • Secured access via AWS IAM roles, Unity Catalog, and S3 bucket policies.

Consultant

Gyansys Infotech Pvt. Ltd.
Bangalore
05.2021 - 03.2023
  • Executed AWS-based Cloudera migration, significantly reducing operational overhead.
  • Integrated 10 data sources with Kafka and PySpark to enhance data accessibility.
  • Automated workflows, achieving a 20% reduction in monthly manual effort.
  • Designed efficient storage mechanisms for large-scale data systems.
  • Delivered BI-ready views for Tableau utilizing SQL and Google Analytics API.
  • Collaborated on infrastructure planning to align with client business objectives.
  • Developed ML models for business forecasting, generating actionable insights using Python and SQL.

Hadoop/Big Data Consultant

San Information
Hyderabad
07.2015 - 05.2021
  • Executed POCs to benchmark throughput across multiple Hadoop distributions.
  • Established secure Hadoop clusters with MIT Kerberos for real-time and batch analytics.
  • Delivered L1/L2 support for Hadoop, Spark, and Hive systems.
  • Migrated data infrastructure to cloud platforms to improve scalability.
  • Employed PySpark and Hive for real-time and batch data analysis and reporting.
  • Identified data patterns, resulting in a 15% reduction in inventory costs.
  • Optimized cluster performance through OS and YARN configuration tuning.

Education

Post Graduation - Data Science and Business Analytics

University of Texas
Austin, Texas
07.2022

Bachelor of Computer Application - Science

MGM's Dr. G. Y. Patharikar College of CS & IT
Aurangabad
04.2015

Skills

  • Databricks workspace and jobs
  • Delta Live Tables and Unity Catalog
  • Big data processing
  • Apache Spark and PySpark
  • Spark SQL
  • Delta Lake and Amazon S3
  • Medallion architecture
  • AWS Glue and Redshift
  • Athena and Lambda
  • CloudWatch and IAM
  • Streaming technologies
  • Spark Structured Streaming
  • Amazon MSK (Kafka) and Kinesis
  • Programming languages
  • Python and SQL
  • Orchestration tools
  • Databricks Workflows and Apache Airflow (MWAA)
  • Data modeling techniques
  • Dimensional modeling and SCD Type 1 & 2
  • DevOps practices
  • Git, CI/CD, Terraform, CloudFormation
  • Data governance strategies
  • Unity Catalog, IAM, encryption, data masking

Certification

  • Python, Code with Mosh
  • SQL, Code with Mosh
  • Pursuing Full Stack & Data Engineering certification from Scaler

Languages

Proficient in English, Hindi, Marathi, and Gujarati

Passions

  • Bike Riding - Enthusiastic about long-distance riding; active participant in local riding events and summits.
  • Travel Photography - Enjoy capturing unique landscapes and cultures while exploring new countries and environments.

Leadership and Extracurricular Activities

  • Interviewed candidates for Data Engineering and Data Science roles
  • Campus recruitment panelist for NucleusTeq
  • Conducted onboarding sessions and workflow training for new hires
  • Designed and implemented new operational and development processes

Accomplishments

NucleusTeq Guiding Star
