To work in a challenging and creative environment and contribute towards the goals of the organization.
PySpark, Spark, Azure Databricks, MapReduce
+91-8074405932
+91-9705903982
Project 1: Data Migration from SQL Server to Snowflake
Client: Nature Sweet
Role: Staff Engineer
Environment: Azure, PySpark, Azure Databricks, Snowflake, Python-based Snowflake stored procedures, LAZSA Platform, Agile Methodology, Jira
Duration: Nov 2023 to present
Project Scope: The program objective is to perform data engineering tasks on various data sources and ingest/cleanse the data into Snowflake.
Roles & Responsibilities:
· Provided architecture-level solutions for building new pipelines and re-designing existing ones.
· Worked closely with the client and Business Analysts for requirement gathering.
· Developed Python-based Snowflake stored procedures to move data between layers of Snowflake tables (see the sketch after this list).
· Managed data coming from different sources and loaded structured data into the Snowflake landing layer.
· Used PySpark on Azure Databricks to load data from SQL Server tables into Snowflake tables.
· Conceptualized and designed an end-to-end framework for batch processing.
· Worked on the LAZSA platform to design pipelines and schedule batch-processing jobs.
· Used Jira for project tracking, bug tracking and project management.
· Involved in the complete end-to-end code deployment process in production.
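As an illustration of the Python-based Snowflake stored procedures mentioned above, here is a minimal Snowpark sketch; the table names, columns and cleansing rules are hypothetical, not the actual client objects:

```python
# Hypothetical sketch of a Python (Snowpark) stored procedure handler that
# moves data from a landing-layer table to a curated-layer table.
from snowflake.snowpark import Session
from snowflake.snowpark.functions import col, current_timestamp


def load_curated_orders(session: Session) -> str:
    # Read the landing-layer table populated by the Databricks/PySpark pipeline.
    landing = session.table("LANDING.ORDERS")

    # Basic cleansing: drop rows without a key, dedupe, stamp the load time.
    curated = (
        landing.filter(col("ORDER_ID").is_not_null())
               .drop_duplicates("ORDER_ID")
               .with_column("LOAD_TS", current_timestamp())
    )

    # Overwrite the curated-layer table.
    curated.write.mode("overwrite").save_as_table("CURATED.ORDERS")
    return f"Loaded {curated.count()} rows into CURATED.ORDERS"
```

In Snowflake such a handler would typically be registered with CREATE PROCEDURE ... LANGUAGE PYTHON ... HANDLER = 'load_curated_orders' and invoked by the orchestration pipeline between layers.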
Project 2: CALIBO
Client: CALIBO
Role: Staff Engineer
Environment: AWS S3, PySpark, Azure Databricks, Snowflake, SQL stored procedures, LAZSA Platform, Agile Methodology, Jira
Duration: Nov 2022 to present
Project Scope: The program objective is to perform data engineering tasks on various data sources and ingest/cleanse the data into Snowflake.
Roles & Responsibilities:
· Provided architecture-level solutions for building new pipelines and re-designing existing ones.
· Worked closely with the client and Business Analysts for requirement gathering.
· Developed SQL stored procedures to move data between layers of Snowflake tables.
· Managed data coming from different sources, maintained AWS S3 storage and loaded structured data.
· Used PySpark on Azure Databricks to load data from AWS S3 into Snowflake tables (see the sketch after this list).
· Conceptualized and designed an end-to-end framework for batch processing.
· Worked on the LAZSA platform to design pipelines and schedule batch-processing jobs.
· Used Jira for project tracking, bug tracking and project management.
· Involved in the complete end-to-end code deployment process in production.
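A minimal PySpark sketch of the S3-to-Snowflake load referenced above, assuming a Databricks notebook (where spark and dbutils are provided by the runtime) and the Spark-Snowflake connector; the bucket, secret scope and table names are hypothetical:

```python
# Hypothetical Databricks/PySpark sketch: read structured files landed in S3
# and write them to a Snowflake table through the Spark-Snowflake connector.
# Assumes a Databricks notebook where `spark` and `dbutils` already exist;
# bucket, secret scope and table names are illustrative.

# Read the structured source data from S3 (assumed Parquet here).
df = spark.read.parquet("s3a://example-landing-bucket/sales/")

# Connector options; credentials come from a Databricks secret scope rather
# than being hard-coded.
sf_options = {
    "sfURL": "example_account.snowflakecomputing.com",
    "sfUser": dbutils.secrets.get("example-scope", "sf-user"),
    "sfPassword": dbutils.secrets.get("example-scope", "sf-password"),
    "sfDatabase": "ANALYTICS",
    "sfSchema": "LANDING",
    "sfWarehouse": "LOAD_WH",
}

(df.write
   .format("snowflake")          # short name for net.snowflake.spark.snowflake
   .options(**sf_options)
   .option("dbtable", "SALES")
   .mode("overwrite")
   .save())
```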
Project 3: Retailer Data Ingestion System
Client: UK Retailer Company
Role: Application Development Team Lead
Environment: AWS S3, HDFS 2.7.3, Spark 2.4.5, Hive 1.1.0, PySpark, Hue, Jenkins, AWS services, WinSCP, YARN, Agile Methodology, Jira
Duration: April 2021 to Nov 2022
Project Scope: The client is a British multinational consumer goods company headquartered in London, England. Unilever products include food, condiments, ice cream, cleaning agents, beauty products, and personal care items; it is the largest soap producer in the world and its products are available in around 190 countries. The project deals with retailers' sales, product and review data from across the globe: data arrives from various retailer sources in different file formats (e.g. .csv, .xlsx, .parquet, .csv.gz, .json), is dumped into S3 buckets, processed in two stages, and loaded into final Hive tables in Parquet/ORC format, from where downstream teams consume it (a representative PySpark sketch follows the responsibilities list).
Roles & Responsibilities:
· Managed a team of 10 members.
· Provided architecture-level solutions for building new pipelines and re-designing existing ones.
· Worked closely with the client and Business Analysts for requirement gathering.
· Developed Hive queries on different data formats (text, Parquet, ORC) and leveraged time-based partitioning in HiveQL to improve performance.
· Managed data coming from different sources, maintained AWS S3 storage and loaded structured data.
· Worked on PySpark and Spark SQL to convert legacy Hive scripts.
· Implemented extensive Impala queries and created views for ad hoc and business processing.
· Conceptualized and designed an end-to-end framework for batch processing.
· Handled up to 10 terabytes of data per day.
· Developed shell scripts to automate and configure jobs in Jenkins.
· Used Jenkins to schedule batch-processing jobs.
· Used Jira for project tracking, bug tracking and project management.
· Involved in the complete end-to-end code deployment process in production.
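A minimal PySpark sketch of the ingestion pattern described in the project scope, assuming hypothetical S3 paths, columns and table names (the actual pipelines differed per retailer and format):

```python
# Hypothetical PySpark sketch of the retailer ingestion pattern: read a raw
# feed from S3 (format varies per retailer), standardise it, and write it to
# a partitioned Hive table in Parquet. All names are illustrative and the
# target database is assumed to exist already.
from pyspark.sql import SparkSession
from pyspark.sql.functions import input_file_name, to_date

spark = (SparkSession.builder
         .appName("retailer_ingestion")
         .enableHiveSupport()
         .getOrCreate())

def read_raw(path: str, fmt: str):
    """Stage 1: read a retailer feed in whatever format it arrives in."""
    if fmt == "csv":              # also covers .csv.gz, which Spark decompresses itself
        return spark.read.option("header", "true").csv(path)
    if fmt == "json":
        return spark.read.json(path)
    if fmt == "parquet":
        return spark.read.parquet(path)
    raise ValueError(f"Unsupported format: {fmt}")

raw = read_raw("s3a://example-retailer-bucket/sales/2022-06-01/", "csv")

# Stage 2: light standardisation before the final load (assumes a sale_date column).
cleaned = (raw.withColumn("source_file", input_file_name())
              .withColumn("sale_date", to_date("sale_date")))

# Final load: partitioned Parquet Hive table consumed by downstream teams.
(cleaned.write
        .mode("append")
        .format("parquet")
        .partitionBy("sale_date")
        .saveAsTable("retail_db.sales"))
```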
Project 4: Health Legacy System
Client: US Health Insurance Company
Role: Associate Hadoop Developer
Environment: CDH 5.16.0, HDFS 2.7.3, Spark 1.6.0, Hive 1.1.0, Impala 2.7.0, Sqoop 1.4.6, Hue, Rundeck, Putty, Big Decision, YARN, Agile Methodology, HP ALM
Duration: December 2018 to April 2021
Project Scope: The client is one of the United States' largest nonprofit health plans, established in 1937 to provide New York's working families with access to medical services regardless of cost. The project deals with the health insurance legacy system data: data is extracted from Oracle (source) and loaded into Hive tables for downstream teams. It covers several subject areas such as Membership, Claim, Accounting, Product and Commission. Data arrives from Oracle in text file format, is processed in three stages, and is loaded into final Hive tables in Parquet format, from where downstream teams consume it.
Roles & Responsibilities:
· Worked closely with Business Analysts for requirement gathering.
· Developed Hive queries on different data formats (text, Parquet) and leveraged time-based partitioning in HiveQL to improve performance (a sketch follows this list).
· Managed data coming from different sources, maintained HDFS and loaded structured data.
· Imported and exported data between HDFS and RDBMS using Sqoop.
· Implemented extensive Impala queries and created views for ad hoc and business processing.
· Conceptualized and designed an end-to-end framework for batch processing 180 tables every day.
· Handled up to 8 terabytes of data per day.
· Developed shell scripts to automate and configure jobs in Rundeck.
· Used the Big Decision tool for data cleansing.
· Used Rundeck to schedule batch-processing jobs.
· Used HP ALM for project tracking, bug tracking and project management.
· Involved in the complete end-to-end code deployment process in production.
· Beyond the above, took full ownership of deliverables, reviewed teammates' assigned work and provided inputs where required.
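A sketch of the time-partitioned HiveQL pattern referenced above, issued through PySpark purely for illustration (the project worked in Hive/Impala directly on an older Spark release); table and column names are hypothetical and the staging table is assumed to exist:

```python
# Hypothetical sketch: a Parquet Hive table partitioned by load date, loaded
# one partition at a time from a text-format staging table. Names and the
# partition value are illustrative only.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("claims_partition_load")
         .enableHiveSupport()
         .getOrCreate())

# Final layer: Parquet table partitioned by load date to allow partition pruning.
spark.sql("""
    CREATE TABLE IF NOT EXISTS health_db.claims (
        claim_id     STRING,
        member_id    STRING,
        claim_amount DECIMAL(18,2)
    )
    PARTITIONED BY (load_dt STRING)
    STORED AS PARQUET
""")

# Load one day's data from the text-format staging table into its partition.
spark.sql("""
    INSERT OVERWRITE TABLE health_db.claims PARTITION (load_dt = '2020-01-15')
    SELECT claim_id, member_id, claim_amount
    FROM   health_db.claims_stage_txt
    WHERE  load_dt = '2020-01-15'
""")
```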
Project 5: Customer Service Expectations
Client: US Insurance Company
Role: Hadoop Developer
Environment: Hadoop, HDFS, Hive, YARN, Sqoop, Java, Spring Tool Suite, IBM DB2, UNIX, RMS, Zookeeper and Putty
Duration: June 2017 to November 2018
Project Scope: The main aim is to deal with agents' licenses. The project came into the picture as a replacement for the Select Agent criteria. The Select Agent Indicator is a special criterion under which an agent is given priority over normal agents when a search is run through the user interfaces; an agent is called a Select Agent when certain conditions are satisfied. In this project the Select Agent criterion was made obsolete and a new concept, the Customer Service Expectations (CSE) Agent, was introduced: in addition to the existing Select Agent criteria, further parameters decide whether an agent qualifies as a CSE Agent.
Roles & Responsibilities:
· Worked closely with business customers for requirement gathering.
· Developed Sqoop jobs with incremental loads from heterogeneous RDBMS (IBM DB2) using native DB connectors.
· Designed a Hive repository with external tables, internal tables, partitions, ACID properties and UDFs for incremental loads of parsed data feeding analytical and operational dashboards.
· Developed Hive queries on different data formats (text, CSV, log files) and leveraged time-based partitioning in HiveQL to improve performance.
· Created Hive external tables for data in HDFS and moved data from the archive layer to the business layer with Hive transformations.
· Developed a Spark application in Scala against Hive tables to determine the CSE Agents (see the sketch after this list).
· Worked on the Revision Management System (RMS) to deploy changes in production.
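The production application was written in Scala; the following is a hypothetical PySpark sketch of the same kind of logic, with made-up CSE criteria and table/column names, purely to illustrate the approach:

```python
# Hypothetical sketch of flagging CSE Agents from Hive tables. The real
# criteria, tables and columns differ; these are illustrative placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (SparkSession.builder
         .appName("cse_agent_flagging")
         .enableHiveSupport()
         .getOrCreate())

agents   = spark.table("agency_db.agents")
licenses = spark.table("agency_db.agent_licenses")

# Illustrative criteria: an active license plus minimum tenure and a
# customer-satisfaction threshold decide whether an agent is flagged as CSE.
cse_agents = (
    agents.join(licenses, "agent_id")
          .where(
              (F.col("license_status") == "ACTIVE")
              & (F.col("tenure_years") >= 2)
              & (F.col("csat_score") >= 4.0)
          )
          .select("agent_id", "agent_name")
          .withColumn("cse_indicator", F.lit("Y"))
)

cse_agents.write.mode("overwrite").saveAsTable("agency_db.cse_agents")
```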
Project 6: Associate Data Movement
Client: US Insurance Company
Role: Software Development Engineer
Environment: Hadoop, HDFS, Hive, Sqoop, Java, Spring Tool Suite, IBM DB2, UNIX, RMS, Zookeeper and Putty
Duration: June 2016 to May 2017
Project Scope: The main purpose of this release is to migrate associate data related to Authorizations, Contacts, Registrations, etc. from the Associate Register to Hadoop. While it is not critical for all associate information to be updated in real time, a subset of the data must be continuously available for retrieval on the Integrated Customer Platform/Technical Platform. Many Hadoop applications require associate data such as associate name, contact information and authorization to service specific products.
Roles & Responsibilities:
· Worked closely with business customers for requirement gathering.
· Developed Sqoop jobs with incremental loads from heterogeneous RDBMS (IBM DB2) using native DB connectors.
· Designed a Hive repository with external tables, internal tables, partitions and UDFs for incremental loads of parsed data feeding analytical and operational dashboards (a sketch follows this list).
· Created Hive external tables for data in HDFS and moved data from the archive layer to the business layer with Hive transformations.
· Developed business logic.
· Performed unit testing.
· Worked on the Revision Management System (RMS) to deploy changes in production.
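The Hive repository here was built directly in HiveQL (with Java-based UDFs); the following PySpark sketch only illustrates the external-table-plus-UDF pattern used when promoting data from the archive layer to the business layer, with hypothetical names and UDF logic:

```python
# Hypothetical sketch: external Hive table over raw HDFS files plus a small
# UDF applied while loading the business layer. The business-layer table is
# assumed to exist already; all names and logic are illustrative.
from pyspark.sql import SparkSession
from pyspark.sql.types import StringType

spark = (SparkSession.builder
         .appName("associate_data_movement")
         .enableHiveSupport()
         .getOrCreate())

# External table over the raw associate files already landed in HDFS.
spark.sql("""
    CREATE EXTERNAL TABLE IF NOT EXISTS assoc_db.associates_archive (
        associate_id STRING,
        full_name    STRING,
        phone        STRING
    )
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
    LOCATION '/data/archive/associates'
""")

# Small illustrative UDF: normalise phone numbers to digits only.
spark.udf.register(
    "normalize_phone",
    lambda s: "".join(ch for ch in s if ch.isdigit()) if s else None,
    StringType(),
)

# Promote cleansed data from the archive layer to the business layer.
spark.sql("""
    INSERT OVERWRITE TABLE assoc_db.associates_business
    SELECT associate_id, full_name, normalize_phone(phone)
    FROM   assoc_db.associates_archive
""")
```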
Project 7: Associate Data Movement
Client: US Insurance Company
Role: Software Development Engineer
Environment: COBOL, JCL, IBM DB2, RMS
Duration: September 2014 to May 2016
Project Scope: This was a leasing project concerning leases for movable goods such as trucks. Agents, agent staff, employees and externals each have different roles; agents, agent staff and employees are internal to the client, while the others are external. All associates' personal and business data is stored in tables. Agents have policies, agreements and products to sell to customers; agent staff and employees work under agents and handle an agent's work. The whole process is controlled through different applications.
Roles & Responsibilities:
· Prepared project documentation.
· Analyzed the functional documents.
· Developed code for new modules and enhanced existing modules.
· Performed unit testing.