Ishita Karmakar

Bankura

Summary

Driven data engineer with 3+ years of experience developing Oozie workflows, PySpark scripts, Spark SQL queries, and applications using Big Data frameworks. Proven ability to support multiple projects under tight deadlines, often with competing priorities and complexities.

Overview

3 years of professional experience

Work History

Associate

Cognizant Technology Solutions
Kolkata
07.2023 - Current
  • Development of Data Quality Check Controls:
  • Developed a PySpark script to enhance data quality by implementing controls on multiple data products.
  • Facilitated audit procedures by embedding the developed PySpark script within established processes.
  • Actuarial Analytics:
  • Converted Python scripts to PySpark, reducing job run time by 30% compared with the previous processes.
  • Enhanced Python code and wrapper script to align with business requirements for the actuarial assumption study process.
  • Developed HQL scripts to extract data from the data lake based on client specifications.
  • Streamlined the EQV Mortality process by developing automation scripts that generate the output files required by the business.

Programmer Analyst

Cognizant Technology Solutions
Kolkata
08.2022 - 06.2023
  • Conversion of Avro data format to Parquet:
  • Developed a PySpark script to back up old Avro data, convert old Avro files to Parquet format, and recreate tables in Parquet format.
  • Remediated code in Oozie workflows to support data ingestion in Parquet format.
  • Developed a PySpark data validation script to run after conversion from Avro to Parquet.
  • CDC logic development for a US-based insurance technology company:
  • Developed Change Data Capture (CDC) logic using PySpark and Spark SQL to ingest, transform, and aggregate data.
  • Ingested various tables from multiple data sources into the data lake using Apache Hadoop.
  • Created Hive tables and wrote HQL (Hive Query Language) queries.
  • Created lookup tables in Hive to maintain performance and support data modification.
  • Designed and developed Oozie workflow jobs to schedule queries, scripts, and actions.

Programmer Analyst Trainee

Cognizant Technology Solutions
Kolkata
07.2021 - 07.2022
  • Migration from Cloudera Distribution Hadoop to Cloudera Data Platform:
  • Remediated code for the migration from Cloudera Distribution Hadoop (CDH) to Cloudera Data Platform (CDP).
  • Remediated existing Pig scripts by converting them to PySpark scripts.
  • Converted Pig actions in Oozie workflows to Spark actions.
  • Developed Python automation scripts to convert managed Hive tables to external tables and to add various set commands to Hive scripts for the CDP environment.
  • Ran CI/CD pipelines, created releases in Azure DevOps, and deployed to production.

Education

Executive M.Tech - BigData and Blockchain Technologies

Indian Institute of Technology Patna
06.2024

Bachelor of Technology - Computer Science

University of Petroleum And Energy Studies
Dehradun, UT
06.2021

Skills

TECHNICAL SKILLS:

  • Big Data Ecosystems: Hadoop, HDFS, Hive, Sqoop, Spark, Oozie
  • Programming Languages: SQL, Python
  • Hadoop Distributions: Cloudera – CDH 6, CDP
  • Operating Systems: Microsoft Windows, UNIX
  • Cloud Ecosystem: Azure, Azure Data Lake Storage (ADLS), Azure DevOps

SOFT SKILLS:

  • Team Player, Growth Mindset, Adaptability, Active Listening, Empathy

Accomplishments

  • Internship at IIT Kharagpur (05/2020 - 07/2020)
  • IMPROVING K-MEANS CLUSTERING USING DETERMINANTAL POINT PROCESS: A major drawback of k-means clustering is that it selects the initial centroids randomly, so if the chosen centres are too close to each other, the final cluster centres can be poor. Sampling the initial centres with a determinantal point process, which favours diverse points, can overcome this drawback.
  • ACHIEVEMENTS
  • Received cheer points for hard work and extensive production support during the migration from CDH to CDP.
  • Received an honorarium payout for serving as a buddy mentor to Cognizant interns, helping them resolve code issues in their dummy project work.

Certifications

Microsoft Azure Fundamentals (AZ-900)

Languages

  • English
  • Hindi
  • Bengali
