

Senior Lead Data Engineer with over 18 years of experience designing and implementing large-scale data engineering, AI, and cloud solutions on AWS and Azure. Skilled in building RAG-based AI assistants for enterprise automation using Azure OpenAI, Databricks Mosaic AI, and Microsoft Teams. Proficient in Databricks Lakehouse architecture (Delta Lake, Photon, Unity Catalog) and ETL pipeline development using PySpark, Python, SQL, Kafka, dbt, and REST APIs. Strong hands-on experience with AWS (S3, Glue, EMR, Redshift, Lambda) and Azure (Data Lake, Synapse, ADF, Purview) for building scalable, governed, and cost-efficient data platforms. Adept at MLflow experiment tracking, Terraform, and GitLab CI/CD for automated, optimized cloud deployments.
Databricks platform (Delta Lake, Photon, Unity Catalog)
Cloud platforms: AWS (S3, Glue, EMR, Redshift), Azure (Data Lake, Synapse, OpenAI)
AI/ML: RAG, NLP, Vector Search, MLflow tracking, prompt optimization
Real-time data pipelines (Kafka, REST APIs, Auto Loader, DLT)
ETL/ELT frameworks: PySpark, dbt, AWS Glue, ADF
DevOps automation: Terraform, GitLab CI/CD
Data modeling: Star Schema, Data Vault, SCD Type 2
Governance & compliance: Unity Catalog, IAM, KMS, Purview
Visualization: Power BI, QuickSight
Programming: PySpark, Python, SQL