Lubna Ishaq

Data Engineer
Hyderabad, TG

Summary

  • Data engineer with 2 years of experience building big data ecosystems.
  • Hands-on expertise in Spark-based ETL workflows, including data ingestion, transformation, and aggregation for large-scale datasets.
  • Maintained and monitored Spark clusters on AWS EMR, ensuring high availability and fault tolerance.
  • Experienced in optimizing Spark SQL performance by tuning configuration settings such as memory allocation, caching, and serialization.
  • Integrated AWS S3 with PySpark jobs to handle large datasets in a distributed environment.
  • Managed ETL processes with PySpark running on AWS EMR, using AWS S3 for storage.
  • Expertise in developing and deploying serverless applications using Google Cloud Functions, enabling cost-effective and scalable solutions.
  • Familiarity with Google Cloud Storage buckets, object lifecycle policies, and access control mechanisms to ensure data availability and compliance.
  • Created DAG templates to standardize job orchestration across multiple Spark use cases.
  • Expertise in testing ETL workflows and job scheduling mechanisms.
  • Experienced in testing data integration and synchronization between different systems using ETL processes.
  • Skilled in writing efficient SQL queries for data extraction, cleansing, and reporting across relational and distributed databases.
  • Proficient in Python scripting for data manipulation, automation, and integration with big data frameworks and APIs.
  • Experience deploying data solutions on cloud infrastructure, including AWS S3, EC2, Lambda, and Azure Data Lake, ensuring high availability and performance.
  • Knowledge of workflow orchestration tools such as Apache Airflow and version control with Git for collaborative development.

Overview

2
years of professional experience

Work History

STARLITE INFOTECH LIMITED
09.2023 - 08.2025

CleverTap
09.2023 - 01.2025
  • Developed and maintained scalable ETL pipelines using Apache Spark and PySpark to process large volumes of structured and semi-structured data from diverse sources.
  • Worked with Spark's data serialization formats (Avro, Parquet, JSON, etc.).
  • Built end-to-end PySpark pipelines on AWS EMR, reading data from AWS S3.
  • Designed and executed SQL queries for data extraction, transformation, and aggregation, supporting business intelligence dashboards and ad hoc reporting.
  • Automated data ingestion and transformation tasks using Python scripts, improving pipeline efficiency and reducing manual intervention.
  • Managed auto-scaling configuration on Google Compute Engine instances to adapt to fluctuating workloads and reduce operational costs.
  • Used Airflow's PythonOperator and BashOperator to preprocess EMR step arguments and manage dependencies.
  • Applied performance tuning techniques to Spark jobs and SQL queries, achieving up to a 30% reduction in execution time.
  • Participated in the migration of legacy data systems to cloud-native architectures, improving scalability and reducing infrastructure cost.
  • Collaborated with data analysts and business teams to understand data requirements and deliver clean, validated datasets for reporting and analytics.
  • Implemented data validation and quality checks to ensure accuracy, completeness, and consistency across data pipelines.

Education

Degree -

Shadan Degree College
01.2023

High School -

International Indian School
01.2020

High School -

International Indian School
01.2018

Skills

  • Apache Spark
  • PySpark
  • Python
  • SQL
  • AWS
  • Azure
  • Spark tuning techniques
  • Query optimization
  • ETL
  • Data Pipelines
  • Data validation
  • Quality checks
