N R NAJEEB RAHMAN

AWS Data Engineer
THIRUVANANTHAPURAM

Summary

Results-driven AWS Data Engineer with over three years of experience designing, implementing, and managing scalable data solutions. Currently at Cognizant, with a strong background in AWS services such as Glue, DMS, S3, Lambda, RDS, and Redshift for building efficient ETL pipelines and data migration strategies. Proficient in PySpark, specializing in processing large datasets, implementing change data capture (CDC), and automating data workflows with AWS Lambda and Step Functions. Skilled in optimizing data storage and retrieval while ensuring compliance with security and data governance standards. Proven ability to collaborate with cross-functional teams and deliver data-driven insights; dedicated to driving business intelligence and analytics initiatives.

Overview

3 years of professional experience
5 Certifications

Work History

Databricks PySpark Developer

Cognizant
08.2023 - Current
  • Design, develop, and maintain scalable ETL pipelines using PySpark on Databricks for data transformation, aggregation, and enrichment.
  • Implement data integration solutions to move data from various sources (e.g., APIs, databases, cloud storage) into Databricks Lakehouse and Delta Lake.
  • Leverage Databricks' distributed computing capabilities to perform complex data transformations and analytics.
  • Work with structured and semi-structured data formats such as JSON, Parquet, Avro, and Delta Lake.
  • Monitor and troubleshoot PySpark jobs, identifying and resolving performance bottlenecks.
  • Tune PySpark code for efficiency, using techniques such as partitioning, caching, and optimizing cluster resources.
  • Utilize Databricks notebooks for collaborative development, testing, and debugging of PySpark code.
  • Integrate with Databricks Delta Lake to manage data lake storage, maintain ACID transactions, and optimize queries.
  • Use Databricks REST API for automating job execution, cluster management, and workflow orchestration.
  • Implement email notifications for job success, failure, and completion status using Databricks Jobs or external services like AWS SES or SMTP.
  • Write unit and integration tests for PySpark code to ensure data quality and consistency.
  • Use Databricks' built-in security features, such as role-based access controls and data encryption, to secure data pipelines.

AWS Data Engineer

Cognizant
08.2021 - 08.2023
  • Designed, built, and maintained scalable data pipelines using AWS Glue with PySpark for ETL processes.
  • Automated data workflows using AWS Lambda functions to trigger data processing and transformation tasks.
  • Implemented data migration using AWS DMS to move data from various sources to S3, including change data capture for real-time updates.
  • Implemented data integration solutions with AWS services such as S3, DynamoDB, RDS, and Redshift.
  • Managed and optimized data storage in AWS S3, ensuring secure, efficient, and cost-effective storage solutions.
  • Set up data partitioning and cataloging in the Glue Data Catalog for efficient querying and processing using PySpark.
  • Implemented event-driven architectures using SNS for real-time data processing and notifications.
  • Implemented logging, monitoring, and alerting solutions using CloudWatch, SNS, and Lambda.
  • Applied IAM roles and policies to ensure secure access to AWS resources, such as Glue, S3, and Lambda.
  • Ensured compliance with data security and governance policies, including encryption at rest and in transit.


Education

Bachelor of Technology - Applied Electronics And Instrumentation

College of Engineering Trivandrum
Trivandrum, India
04.2001 -

Skills

Python

Certification

AWS - Amazon Web Services Certified Cloud Practitioner
