Results-driven Data Engineer with over 6 years of experience building data pipelines with Python, SQL, and Apache Spark. Expertise in orchestrating data integration, migration, and analytics, leveraging cloud computing to deliver scalable end-to-end data solutions. Seeking a challenging role in the data domain to drive organizational growth.
Overview
6
years of professional experience
1
Certification
Work History
Data Engineer
Tata Consultancy Services Limited
KOLKATA
10.2020 - Current
Client: Leading UK Investment Firm
Project: Digital Experience Platform (Web and Mobile)
Technologies: Python, PySpark, AWS, MySQL DB
Engineered scalable data pipelines for efficient data processing and visualization as part of a digital cloud migration.
Orchestrated data ingestion, storage and transformation through ETL processes leveraging Apache Spark, AWS and SQL.
Ingested on-premises data through AWS Database Migration Service (DMS) change data capture (CDC), with support for CSV, Parquet and JSON formats, ensuring zero data loss and 99.9% uptime.
Analyzed user requirements and developed data models for a robust Data Lake using Apache Hudi tables on S3, DynamoDB and RDS, ensuring efficient upsert and delete operations.
Built and optimized serverless workflows on AWS (S3, Lambda, SQS, Glue, Step Functions and EventBridge), enabling real-time analytics and reducing infrastructure costs by 20%.
Wrote complex PySpark SQL queries for data validation and transformation, improving data accuracy and query performance by 15%.
Utilized Python libraries such as Pandas, NumPy, Boto3, and PyMySQL for data manipulation, analysis, and visualization, enhancing data-driven decision-making.
Built real-time data querying capabilities using AWS Athena on partitioned and catalogued datasets, with monitoring and alerting integrated through CloudWatch and SNS/SQS notifications.
Configured and maintained cloud-based data infrastructure on AWS using CloudFormation, enforcing KMS encryption and IAM access control policies.
Managed version control and deployment of data applications using Git and Jenkins.
System Engineer
Tata Consultancy Services Limited
KOLKATA
06.2019 - 09.2020
Client: Leading US Retail Pharmacy Chain
Project: Data Migration for lower TCO and improved performance
Technologies: Ab Initio, Talend, Java, Hadoop (HDFS, YARN, MapReduce)
Analyzed data sets to ensure accurate migration between systems, reducing TCO by about 60%.
Migrated complex Ab Initio graphs to Talend, utilizing Hadoop for big data processing and reducing processing time by up to 80%.
Improved efficiency by implementing data cataloging and metadata management.
Resolved intricate source-to-target mapping challenges during the migration.
Developed and executed test plans to validate data integrity and volume using automation scripts.
Education
Bachelor of Technology - Electronics And Communication Engineering
Soft Skills: Problem-Solving, Team Collaboration, Stakeholder Communication
Accomplishments
On the Spot Award at Tata Consultancy Services Limited - awarded for delivering a robust data migration solution with high scalability and quick turnaround, while staying within the project budget.
Star Team Award at Tata Consultancy Services Limited - awarded for completing a crucial delivery that expedited the real-time user onboarding process.
Conference paper presented at the 107th Indian Science Congress - Identification and Impact of Wormhole Attacks in MANET Networks.
Semi-finalist in the India Innovation Challenge Design Contest 2016, organized by Texas Instruments and DST - for developing SOUL, a wearable device.
Certification
AWS Certified Developer - Associate
Agile foundation using Jile - Tata Consultancy Services Limited
Projects
Identification and Impact of Wormhole Attacks in MANET Networks - (NS-3, C++, Python) Simulated and detected wormhole attacks in a MANET (mobile ad hoc network) using the AODV routing protocol over UDP, achieving a best throughput of 237.93 kbps and a nominal packet loss ratio of around 2%.
Vessel Tracking in Angiograms - (Python, Matplotlib and MATLAB) Centerline tracking and image thresholding using vessel filter techniques, filter design and image noise reduction. Early blockage detection technique with 97% accuracy.
Languages
English
First Language
Bengali
Proficient (C2)
Hindi
Upper Intermediate (B2)
References
References available upon request.