KISANU BHATTACHARYA

Google Cloud Professional Data Engineer | Azure | Confluent Cloud | Snowflake
Chennai, Tamil Nadu

Summary

  • 9 years of professional experience in the data engineering space, with expertise in cloud platforms (Google Cloud, Azure), SaaS products (Snowflake, Databricks), PaaS (Confluent Cloud), QLIK Replicate for ETL/ELT loads, big data tool stacks (Spark architecture, PySpark, Hive, Kafka), ANSI SQL, and data warehouse development.
  • Hands-on experience with various GCP products and their SDKs: VM instances and instance groups, VPCs, VPC peering, firewall rules, BigQuery, GCS buckets, Cloud Functions, Pub/Sub, Cloud Shell, gsutil commands, the BigQuery CLI, Cloud Dataproc, and Stackdriver Monitoring and Logging.
  • Managing Azure products: VMs, VM scale sets, NSGs and firewalls, HDInsight clusters, and Blob storage containers.
  • Designing Snowflake pipelines, from ingestion and transformation to handling CDC data loads using Streams.
  • Setting up Confluent Cloud in GCP and Azure, and using the managed Snowflake Sink Connector to stream data directly to Snowflake.

Overview

9 years of professional experience
6 years of post-secondary education
2 certifications

Work History

Lead Software Engineer

Impetus Technologies
Chennai, Tamil Nadu
11.2021 - Current

Responsibilities

  • Work on various data ingestion activities from multiple source systems: Oracle, Hive, Snowflake, Azure Blob Storage, GCS buckets, and SAP ECC.
  • Design pipelines and processing systems for the ingested data using HQL, Cloud Dataproc, Azure HDInsight, PySpark jobs, Snowflake procedures, and QLIK Replicate.
  • Migrate the ongoing on-premises Kafka pipeline to Confluent Cloud using the managed Snowflake Sink Connector, and design a data recovery mechanism to transfer and reload the data.
  • Handle various cloud infrastructure tasks in Azure (creating VMs, network security groups, and firewalls) and GCP (VMs, instance groups, VPC peering, firewalls, and subnets).
  • Design lakehouses and warehouses for the processed data in Hive and Snowflake.
  • Define data lifecycle management constraints to move data among storage tiers (hot/cold, or Nearline/Coldline/Archive).
  • Perform cost analysis of various products and track next-generation tool stacks to advise customers on upcoming activities.

Data Engineer - Campaign Execution Analyst

Barclays Bank
CHENNAI, Tamil Nadu
03.2020 - 11.2021

Responsibilities

  • Migrated an entire department database to BigQuery and used Data Studio for reporting.
  • Built data pipelines for ETL jobs within GCP using Airflow operators.
  • Created Dataproc pipelines using ephemeral clusters to analyze data and pass it on to downstream systems.
  • Used Stackdriver Monitoring to view resource billing and to create and monitor alerts for various services.
  • Loaded BigQuery data into pandas or Spark DataFrames for advanced ETL capabilities.
  • Created POCs using BigQuery ML to build and administer models on batch data.
  • Applied performance optimization techniques in BigQuery: partitioning, clustering, and indexing.
  • Performed analysis to denormalize customer data.
  • Created service accounts for various services and integrated them to produce result sets.
  • Used Cloud Shell to configure settings for various GCP services.
  • Created marketing and service campaigns for target customers to promote retail and business products using Teradata Customer Interaction Manager (CIM).
  • Executed shell scripts to include Teradata prescripts and scheduled them using Tivoli/Airflow.
  • Created Hive schemas with performance optimization using bucketing and partitioning.
  • Wrote HQL to transform data for downstream processing.
  • Worked with Impala to execute ad-hoc queries.

Consultant

Deloitte US India
Hyderabad, Telangana
05.2019 - 03.2020

Responsibilities

  • Built analytical pipelines using Spark and Hive.
  • Handled both relational and non-relational databases.
  • Performed multiple data cleansing and transformation activities using PySpark and Hive.
  • Created BigQuery scheduled jobs to migrate batch data from Cloud Storage.
  • Joined the SAP practice as a novice developer and quickly gained expertise in handling a wide range of migration objects.
  • Developed the following objects end to end: Customer Master, Sales Contract, IM Inventory, and Open AR.
  • Created new jobs, tuned existing ones, and prepared validation reports for the client to validate data before load.
  • Worked extensively with IDocs, RFC functions, BAPIs, and transaction codes (T-codes).
  • Created technical specification documents for the migration from ECC to S/4HANA.
  • Liaised with the client team to create and update relevancy rules and incorporate them into the current workflow.
  • Performed additional activities, including building PySpark data pipelines in which data was enriched, rendered, and presented in dashboards.

Senior Analyst

Ernst & Young LLP
CHENNAI, Tamil Nadu
05.2016 - 04.2019

Responsibilities

  • Analyzed legacy business requirements and translated them into ETL flows using SAP Data Services for the Guidewire data migration and upgrade process.
  • Created data pipelines using SAP Data Services to integrate data between the source (legacy data) and target systems (Guidewire ClaimCenter).
  • Handled multiple source systems to create a data lake/staging area from which data could be transformed for downstream systems.
  • Coded, tested, modified, debugged, documented, and implemented workflows and dataflows per client requirements.
  • Produced ad-hoc reports using Tableau to generate insights.
  • Used the Spark API to stream data from various sources in real time.
  • Implemented Spark code in Scala using Spark SQL and DataFrames for aggregation.
  • Used Sqoop to ingest and retrieve data from multiple RDBMSs such as MySQL and Oracle.
  • Designed regression and classification models for the business to predict reserve amounts and whether a customer would buy or renew a policy, respectively.

ETL Developer

AON HEWITT
CHENNAI, Tamil Nadu
09.2013 - 05.2016

Responsibilities

  • Translated business requirements and technical designs into test specifications and test cases.
  • Designed custom mappings using transformations in Informatica PowerCenter 9.x.
  • Worked with ETL tools: Informatica Designer, Workflow Manager, and Workflow Monitor.
  • Integrated data from multiple sources, including CSV, mainframe, and flat files.
  • Developed and executed SQL against DB2 and Oracle.
  • Tested the extraction, transformation, and loading (ETL) mechanism using Informatica.

Education

Master of Computer Applications

SRM University
Chennai
07.2010 - 05.2013

Bachelor of Computer Applications

Agra University
Lucknow
09.2007 - 07.2010

Skills

Google Cloud - VM instances and instance groups, VPCs, VPC peering, firewall rules, BigQuery, GCS buckets, Cloud Functions, Pub/Sub, Cloud Shell, gsutil commands, BigQuery CLI, Cloud Dataproc, and Stackdriver Monitoring and Logging

Certification

Google Cloud Certified Professional Data Engineer
Databricks Lakehouse Fundamentals

Timeline

Databricks Lakehouse Fundamentals

06.2022

Google Cloud Certified Professional Data Engineer

12.2021

Lead Software Engineer

Impetus Technologies
11.2021 - Current

Data Engineer - Campaign Execution Analyst

Barclays Bank
03.2020 - 11.2021

Consultant

Deloitte US India
05.2019 - 03.2020

Senior Analyst

Ernst & Young LLP
05.2016 - 04.2019

ETL Developer

AON HEWITT
09.2013 - 05.2016

Master of Computer Applications

SRM University
07.2010 - 05.2013

Bachelor of Computer Applications

Agra University
09.2007 - 07.2010