Swarnendu Sarkar

Staff Engineer
Bengaluru

Summary

To work in a challenging and creative environment and contribute towards the goals of the organization.

Overview

9 years of professional experience
4 Certifications

Work History

Staff Engineer

Altimetrik
11.2022 - Present
  • Worked on the Calibo project on an SSOT use case from February onwards
  • Hands-on experience with Azure Databricks
  • Experience with Snowflake and SQL stored procedures
  • Experience with AWS S3
  • Experience with the LAZSA platform.

Application Development Team Lead

Accenture
04.2021 - 11.2022
  • Experience in analyzing data using HiveQL 1.1.0
  • Experience in PySpark and Spark SQL
  • Experience in importing and exporting data between HDFS and RDBMS using Sqoop 1.99.7
  • Experience working with SQL-on-Hadoop engines such as Impala 2.7.0 and Hive 1.1.0
  • Hands-on experience in Linux shell scripting
  • Worked with the Cloudera Big Data distribution
  • Hands-on experience in AWS (S3, Lambda, SQS, SNS, CloudWatch)
  • Implemented extensive Impala 2.7.0 queries and created views for ad-hoc and business processing
  • Experience working with different file formats such as TEXTFILE, PARQUET, CSV, ORC and JSON
  • Involved in the complete end-to-end code deployment process in production.

Specialist Programmer

Infosys
11.2020 - 04.2021
  • Designed scalable, configurable, and maintainable solutions for complex business problems
  • Excellent understanding of Hadoop architecture and the underlying Hadoop framework, including storage management
  • Experience in PySpark and Spark SQL
  • Managed data coming from different sources and was involved in HDFS maintenance and the loading of structured data.

Associate Software Developer

Cognizant
12.2018 - 11.2020
  • Experience in analyzing data using HiveQL 1.1.0
  • Experience in importing and exporting data between HDFS and RDBMS using Sqoop 1.99.7
  • Experience working with SQL-on-Hadoop engines such as Impala 2.7.0 and Hive 1.1.0
  • Hands-on experience in Linux shell scripting
  • Worked with the Cloudera Big Data distribution
  • Implemented extensive Impala 2.7.0 queries and created views for ad-hoc and business processing
  • Experience working with different file formats such as TEXTFILE, PARQUET, CSV and JSON
  • Involved in the complete end-to-end code deployment process in production.

Software Developer

HTC Global Services
05.2014 - 11.2018
  • Experience with software development process models such as Agile and Waterfall methodologies
  • Good experience in creating real-time data solutions using Apache Spark Core, Spark SQL and DataFrames
  • Experience in Spark with Scala
  • Hands-on experience with Big Data core components and the ecosystem (Spark Core, Spark SQL, HDFS, MapReduce, YARN, Zookeeper, Hive and Sqoop)
  • Experience in importing and exporting data between HDFS and Relational Database Systems (RDBMS) using Sqoop
  • Good knowledge of DBMSs such as IBM DB2, PostgreSQL and MySQL
  • Experience working with different file formats such as TEXTFILE, CSV and JSON
  • Hands-on experience in COBOL, JCL and RMS.

Education

B.Tech. (Computer Science & Engineering)

Murshidabad College of Engineering & Technology, W.B.U.T.

Class 10+2 Higher Secondary

Gorabazar Iswar Chandra Institution, W.B.C.H.S.E.

Class 10 Madhyamik

Chhabghati K.D. Vidyalaya, W.B.B.S.E.

Skills

    PySpark, Spark, Azure Databricks, MapReduce


Certification

Completed the Snowflake SnowPro Core certification in 2023

Primary No

+91-8074405932


Activities

  • Won Rise and Shine employee award in 2024
  • Won Accenture’s iChamp award in 2022
  • Won Accenture’s Key Performer award in 2021
  • My design and development work was featured in the client's monthly newsletter after their sales increased by 80% due to that change
  • Won HTC Global Services' Best Employee of the Quarter award in 2017
  • Won a state-level scholarship in secondary education in 2005

Alternative No

+91-9705903982

Projects

Project 1: Data Migration from SQL Server to Snowflake

Client: Nature Sweet

Role: Staff Engineer

Environment: Azure, PySpark, Azure Databricks, Snowflake, Python-based Snowflake stored procedures, LAZSA Platform, Agile Methodology, Jira

Duration: Nov 2023 to present

Project Scope: The program objective is to perform data engineering tasks on various data sources and to ingest and cleanse the data into Snowflake.

Roles & Responsibilities:

· Provided architecture-level solutions for building new pipelines and re-designing existing ones.

· Worked closely with the client and Business Analysts for requirement gathering.

· Developed Python-based Snowflake stored procedures to load data between multiple layers of Snowflake tables (a minimal sketch follows this list).

· Managed data coming from different sources and loaded structured data into the Snowflake landing layer.

· Worked with PySpark on Azure Databricks to load data from SQL Server tables into Snowflake tables.

· Conceptualized and designed an end-to-end framework for batch processing.

· Worked on the LAZSA platform to design the pipelines and schedule the jobs for batch processing.

· Used Jira for project tracking, bug tracking and project management.

· Involved in the complete end-to-end code deployment process in production.
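
To illustrate the layer-to-layer loads mentioned above, here is a minimal sketch of a Python-based Snowflake stored procedure handler written with Snowpark. The table names (RAW.ORDERS, CURATED.ORDERS) and the keep-latest-record rule are hypothetical placeholders, not the actual project logic.

```python
# Minimal sketch of a Python (Snowpark) stored procedure handler that promotes
# data from a landing layer to a curated layer. Table names and the
# keep-latest-record rule are hypothetical placeholders, not the project logic.
from snowflake.snowpark import Session
from snowflake.snowpark.functions import col, row_number
from snowflake.snowpark.window import Window


def load_curated_orders(session: Session) -> str:
    raw = session.table("RAW.ORDERS")  # landing-layer table

    # Keep only the latest record per order id (illustrative cleansing step).
    latest_first = Window.partition_by(col("ORDER_ID")).order_by(col("LOAD_TS").desc())
    curated = (raw.with_column("RN", row_number().over(latest_first))
                  .filter(col("RN") == 1)
                  .drop("RN"))

    # Overwrite the curated-layer table with the cleansed snapshot.
    curated.write.mode("overwrite").save_as_table("CURATED.ORDERS")
    return f"Loaded {curated.count()} rows into CURATED.ORDERS"
```

In practice such a handler would typically be registered as a stored procedure (for example via session.sproc.register or CREATE PROCEDURE ... LANGUAGE PYTHON) and invoked from the scheduled pipeline.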

Project 2: CALIBO

Client: CALIBO

Role: Staff Engineer

Environment: AWS S3, PySpark, Azure Databricks, Snowflake, SQL stored procedures, LAZSA Platform, Agile Methodology, Jira

Duration: Nov 2022 to present

Project Scope: The program objective is to perform data engineering tasks on various data sources and to ingest and cleanse the data into Snowflake.

Roles & Responsibilities:

· Provided architecture-level solutions for building new pipelines and re-designing existing ones.

· Worked closely with the client and Business Analysts for requirement gathering.

· Developed SQL stored procedures to load data between multiple layers of Snowflake tables.

· Managed data coming from different sources and was involved in AWS S3 maintenance and the loading of structured data.

· Worked with PySpark on Azure Databricks to load data from AWS S3 into Snowflake tables (a minimal sketch follows this list).

· Conceptualized and designed an end-to-end framework for batch processing.

· Worked on the LAZSA platform to design the pipelines and schedule the jobs for batch processing.

· Used Jira for project tracking, bug tracking and project management.

· Involved in the complete end-to-end code deployment process in production.
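
A minimal sketch of the S3-to-Snowflake load pattern referenced above, assuming PySpark on Databricks with the bundled Spark-Snowflake connector; the bucket path, table name, and connection options are hypothetical placeholders, and credentials would normally come from a secret scope rather than literals.

```python
# Minimal PySpark sketch: read raw files from S3 and load them into Snowflake
# through the Spark-Snowflake connector available on Databricks. Paths, table
# names, and connection options are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("s3_to_snowflake").getOrCreate()

# Read the landed files from S3 (Parquet here; CSV/JSON readers work the same way).
raw_df = spark.read.parquet("s3a://example-landing-bucket/orders/")

# Light cleansing before the load (illustrative).
clean_df = (raw_df.dropDuplicates(["order_id"])
                  .withColumn("load_ts", F.current_timestamp()))

# Connector options; in practice these come from Databricks secrets, not literals.
sf_options = {
    "sfURL": "<account>.snowflakecomputing.com",
    "sfDatabase": "ANALYTICS",
    "sfSchema": "LANDING",
    "sfWarehouse": "LOAD_WH",
    "sfUser": "<user>",
    "sfPassword": "<password>",
}

(clean_df.write
    .format("snowflake")
    .options(**sf_options)
    .option("dbtable", "ORDERS_LANDING")
    .mode("append")
    .save())
```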

Project 3: Retailer data ingestion system

Client: UK Retailer Company

Role: Application Development Team Lead

Environment: AWS S3, HDFS 2.7.3, Spark 2.4.5, Hive 1.1.0, PySpark, Hue, Jenkins, AWS services, WinSCP, YARN, Agile Methodology, Jira

Duration: April 2021 to November 2022

Project Scope: The client is a British multinational consumer goods company headquartered in London, England. Unilever's products include food, condiments, ice cream, cleaning agents, beauty products, and personal care items; it is the largest producer of soap in the world, and its products are available in around 190 countries. The main purpose of the project is to handle retailers' sales, product, and review data. The project was set up to take data from various retailer sources dumped into S3 buckets and load it into Hive tables for downstream teams. Sales, product, and review data from retailers across the globe arrive from various sources in different file formats (e.g. .csv, .xlsx, .parquet, .csv.gz, .json); the data is processed in two stages and then loaded into final Hive tables in Parquet/ORC format, from where downstream teams consume it. A rough sketch of this ingestion pattern is shown below.
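
The following is a minimal PySpark sketch of that two-stage pattern: multiple source formats read from S3, cleansed, and written to a partitioned Hive table in Parquet. The bucket path, schema, partition column, and table name are hypothetical placeholders.

```python
# Minimal PySpark sketch of the two-stage retailer ingestion described above.
# Bucket paths, column names, and the partition column are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (SparkSession.builder
         .appName("retailer_ingestion")
         .enableHiveSupport()
         .getOrCreate())

# Stage 1: land the raw files (CSV here; JSON/Parquet readers work the same way).
raw = spark.read.option("header", True).csv("s3a://example-retailer-bucket/sales/")

# Stage 2: cleanse and standardize before publishing.
cleansed = (raw
            .dropDuplicates(["transaction_id"])
            .withColumn("sale_date", F.to_date("sale_date", "yyyy-MM-dd"))
            .filter(F.col("quantity").isNotNull()))

# Publish to the final Hive table in Parquet, partitioned by date for downstream teams.
(cleansed.write
    .mode("append")
    .format("parquet")
    .partitionBy("sale_date")
    .saveAsTable("retail_dw.sales_final"))
```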


Roles & Responsibilities:

· Managed a team of 10 members.

· Provided architecture-level solutions for building new pipelines and re-designing existing ones.

· Worked closely with the client and Business Analysts for requirement gathering.

· Experienced in developing Hive queries on different data formats such as text, Parquet and ORC files, and in leveraging time-based partitioning for performance improvements using HiveQL.

· Managed data coming from different sources and was involved in AWS S3 maintenance and the loading of structured data.

· Worked on PySpark and Spark SQL to convert old Hive scripts.

· Implemented extensive Impala queries and created views for ad-hoc and business processing.

· Conceptualized and designed an end-to-end framework for batch processing.

· Handled up to 10 terabytes of data per day.

· Developed shell scripts to automate and configure the jobs in Jenkins.

· Worked on Jenkins to schedule the jobs for batch processing.

· Used Jira for project tracking, bug tracking and project management.

· Involved in the complete end-to-end code deployment process in production.

Project 4: Health Legacy system

Client: US Health Insurance Company

Role: Associate Hadoop Developer

Environment: CDH 5.16.0, HDFS 2.7.3, Spark 1.6.0, Hive 1.1.0, Impala 2.7.0, Sqoop 1.4.6, Hue, Rundeck, PuTTY, Big Decision, YARN, Agile Methodology, HP ALM

Duration: December 2018 to April 2021

Project Scope: The client is one of the largest nonprofit health plans in the United States, established in 1937 to provide New York's working families with access to medical services regardless of cost. The main purpose of the project is to handle the health insurance legacy system data. The project was set up to take data from Oracle (the source) and load it into Hive tables for downstream teams. It covers several subject areas such as Membership, Claim, Accounting, Product and Commission. Data arrives from Oracle in text-file format, is processed in three stages, and is then loaded into final Hive tables in Parquet format, from where downstream teams consume it.

Roles & Responsibilities:

· Worked closely with Business Analysts for requirement gathering.

· Experienced in developing Hive queries on different data formats such as text and Parquet files, and in leveraging time-based partitioning for performance improvements using HiveQL.

· Managed data coming from different sources and was involved in HDFS maintenance and the loading of structured data.

· Experience in importing and exporting data between HDFS and RDBMS using Sqoop.

· Implemented extensive Impala queries and created views for ad-hoc and business processing.

· Conceptualized and designed an end-to-end framework for batch processing 180 tables of data every day.

· Handled up to 8 terabytes of data per day.

· Developed shell scripts to automate and configure the jobs in the Rundeck tool.

· Worked on the Big Decision tool for data cleansing.

· Worked on the Rundeck tool to schedule the jobs for batch processing.

· Used HP ALM for project tracking, bug tracking and project management.

· Involved in the complete end-to-end code deployment process in production.

· Apart from the skill sets mentioned above, I take full ownership of the deliverables, review the work assigned to other teammates, and provide my inputs where required.

Project 5: Customer Service Expectations

Client: US Insurance Company

Role: Hadoop Developer

Environment: Hadoop, HDFS, Hive, YARN, Sqoop, Java, Spring Tool Suite, IBM DB2, UNIX, RMS, Zookeeper and PuTTY

Duration: June 2017 to November 2018

Project Scope: The main aim is to handle agents' licenses. The project came about as a replacement for the Select Agent criteria. The Select Agent indicator is a special criterion under which an agent is given priority over normal agents when a search is made through the user interfaces; an agent is called a select agent when certain conditions are satisfied. In this project the select agent criterion was made obsolete and a new concept called Customer Service Expectations (CSE) agent was introduced. Along with the existing select agent criteria, some additional parameters were added to decide whether an agent qualifies as a CSE agent.

Roles & Responsibilities:

· Worked closely with business customers for requirement gathering.

· Developed Sqoop jobs with incremental loads from a heterogeneous RDBMS (IBM DB2) using native DB connectors.

· Designed a Hive repository with external tables, internal tables, partitions, ACID properties and UDFs for incremental loads of parsed data for analytical and operational dashboards.

· Experienced in developing Hive queries on different data formats such as text, CSV and log files, and in leveraging time-based partitioning for performance improvements using HiveQL.

· Created Hive external tables for the data in HDFS and moved data from the archive layer to the business layer with Hive transformations.

· Developed a Spark application using Scala against Hive tables to determine the CSE agents (a rough sketch follows this list).

· Worked on the Revision Management System to install the changes in production.
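
Below is a rough sketch of the kind of logic involved in determining CSE agents. The original application was written in Scala; PySpark is used here only to keep all examples in one language, and the table names, columns, and qualification rule are hypothetical placeholders rather than the actual business criteria.

```python
# Rough PySpark sketch of CSE-agent flagging against Hive tables; the original
# application was written in Scala. Table names, columns, and the qualification
# rule are hypothetical placeholders, not the actual business criteria.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (SparkSession.builder
         .appName("cse_agent_flagging")
         .enableHiveSupport()
         .getOrCreate())

agents = spark.table("agent_db.agents")            # agent master data
licenses = spark.table("agent_db.agent_licenses")  # license records

# Count each agent's active licenses (illustrative criterion).
active_counts = (licenses.filter(F.col("status") == "ACTIVE")
                         .groupBy("agent_id")
                         .agg(F.count(F.lit(1)).alias("active_licenses")))

# Hypothetical rule: at least one active license and a minimum satisfaction score.
cse_agents = (agents.join(active_counts, "agent_id", "left")
                    .withColumn("is_cse",
                                F.when((F.col("active_licenses") >= 1) &
                                       (F.col("satisfaction_score") >= 4.0), True)
                                 .otherwise(False)))

# Persist the flags back to Hive for downstream consumers.
cse_agents.write.mode("overwrite").saveAsTable("agent_db.cse_agents")
```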

Project 6: Associate Data Movement

Client: US Insurance Company

Role: Software Development Engineer

Environment: Hadoop, HDFS, Hive, Sqoop, Java, Spring Tool Suite, IBM DB2, UNIX, RMS, Zookeeper and PuTTY

Duration: June 2016 to May 2017

Project Scope: The main purpose of this release is to migrate associate data related to authorizations, contacts, registrations, etc. from the Associate Register to Hadoop. While it is not critical for all associate information to be updated in real time, a subset of the data should be made continuously available for retrieval on the Integrated Customer Platform/Technical Platform. Many Hadoop applications require associate data such as associate names, contact information and authorization to service specific products.

Roles & Responsibilities:

· Worked closely with business customers for requirement gathering.

· Developed Sqoop jobs with incremental loads from a heterogeneous RDBMS (IBM DB2) using native DB connectors.

· Designed a Hive repository with external tables, internal tables, partitions and UDFs for incremental loads of parsed data for analytical and operational dashboards.

· Created Hive external tables for the data in HDFS and moved data from the archive layer to the business layer with Hive transformations (a rough sketch follows this list).

· Developed the business logic.

· Performed unit testing.

· Worked on the Revision Management System to install the changes in production.
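
The external-table pattern mentioned above can be sketched roughly as follows. The HiveQL is issued through PySpark only to keep the examples in one language (the original work used Hive directly), and the database, table, and column names are hypothetical placeholders.

```python
# Rough sketch of the archive-to-business-layer movement using Hive external
# tables; HiveQL is issued through PySpark here only to keep the examples in
# one language. Database, table, and column names are hypothetical placeholders.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("associate_data_movement")
         .enableHiveSupport()
         .getOrCreate())

for db in ("archive", "business"):
    spark.sql(f"CREATE DATABASE IF NOT EXISTS {db}")

# External table over the Sqoop-landed files in HDFS (archive layer).
spark.sql("""
    CREATE EXTERNAL TABLE IF NOT EXISTS archive.associate_contacts (
        associate_id  STRING,
        contact_type  STRING,
        contact_value STRING,
        load_date     STRING
    )
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    LOCATION '/data/archive/associate_contacts'
""")

# Business-layer table, partitioned for downstream consumers.
spark.sql("""
    CREATE TABLE IF NOT EXISTS business.associate_contacts (
        associate_id  STRING,
        contact_type  STRING,
        contact_value STRING
    )
    PARTITIONED BY (load_date STRING)
    STORED AS PARQUET
""")

# Move one day's data from the archive layer to the business layer with a
# simple transformation (illustrative).
spark.sql("""
    INSERT OVERWRITE TABLE business.associate_contacts
    PARTITION (load_date = '2017-05-01')
    SELECT associate_id,
           UPPER(contact_type)  AS contact_type,
           TRIM(contact_value)  AS contact_value
    FROM archive.associate_contacts
    WHERE load_date = '2017-05-01'
""")
```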

Project 7: Associate Data Movement

Client: US Insurance Company

Role: Software Development Engineer

Environment: COBOL, JCL, IBM DB2, RMS

Duration: September 2014 to May 2016

Project Scope: This was a leasing project concerning leases for movable goods such as trucks, in which agents, agent staff, employees and externals each have different roles. Agents, agent staff and employees are internal to the client, while the others are external. All of the associates' personal and business data is stored in database tables. Agents have policies, agreements and products to sell to customers, and agent staff and employees work under the agents and handle their work. The whole process is controlled through different applications.

Roles & Responsibilities:

· Prepared project documentation.

· Analyzed the functional documents.

· Developed code for new modules and enhanced old modules.

· Performed unit testing.

Disclaimer

I hereby declare that the particulars furnished above are true to the best of my knowledge.
