
MANJUNATH M

Bengaluru

Summary

Overall 9+ years of professional experience, with 5+ years of relevant expertise in Big Data technologies, including Hadoop, Spark, and Hive, delivering scalable data pipelines, ETL solutions, and analytics workflows on CDP and AWS. Skilled in AWS services (S3, IAM), SQL query optimization, Hive (partitioning, bucketing, aggregation), and Spark architecture (Spark Core, Spark SQL, DataFrames). Proven track record of developing and implementing Big Data analytics initiatives and managing large-scale data processing solutions. Proficient in the Hadoop ecosystem (HDFS, Hive, PySpark), Linux, and CI/CD pipelines using Jenkins, Autosys, and RLM. Experienced with development tools and version control: IntelliJ, PyCharm, GitHub, Bitbucket. Strong expertise in production support, defect resolution, RCA, release management, and code reviews. Analytical fast learner with excellent problem-solving and debugging skills, able to adapt quickly to emerging technologies. Adept at collaboration, mentorship, and cross-functional teamwork, fostering a culture of continuous improvement. Experienced in Agile delivery, tracking incidents via JIRA and ensuring smooth releases through Agile best practices.

Overview

10 years of professional experience
1 Certification

Work History

Specialist Data Engineering

LTIMindtree
09.2022 - Current

Client: Citi Bank

Project: The AML Markets platform handles large-scale trade and transaction data from global markets to detect potential suspicious activities, insider trading, and money laundering patterns. It ingests high-volume data from various source systems (trade bookings, pricing systems, reference data), applies complex business rules, and generates alerts for investigation by compliance teams.

Roles:

  • Developed and enhanced data pipelines in PySpark and Hive for AML Markets datasets.
  • Applied advanced optimization techniques (partitioning, bucketing, broadcast joins) to improve job performance.
  • Automated manual tasks using shell scripts to reduce operational workload and improve release timelines.
  • Spearheaded the development, testing, and deployment of data workflows for enterprise-level Datamart systems.
  • Played a key role in completing the CDH-to-CDP migration and the Spark 2 to Spark 3 upgrade.
  • Automated the small-file compaction process by integrating compaction logic (using PySpark and Hive) into existing ingestion workflows, significantly improving query performance and reducing NameNode load.
  • Automated the integration of Jenkins jobs with RLM, reducing manual work by 20% and significantly minimizing manual errors.

Technology & Tools: PySpark, Hive, HDFS, Shell scripts, VS Code, RLM, Jenkins, Notepad++, Vim, CDP, AWS (S3, IAM), Linux

Senior Technical Services Specialist

IBM
04.2021 - 09.2022

Client: DBS Bank

DBS is a leading financial services group in Asia with a presence in 18 markets. The team maintains and processes huge volumes of data as part of day-to-day operations. As a Big Data Engineer, worked with the Risk Management team to load processed data into Hive tables using Spark code.

Roles:

  • Collaborated with internal and client BAs to understand requirements and architect the data flow system.
  • Developed Sqoop scripts to import data from an Oracle database into HDFS.
  • Transferred data into HDFS and deployed PySpark to determine the credit risk of customers.
  • Developed PySpark code for faster processing of data.
  • Created Hive schemas using performance techniques like partitioning and bucketing.
  • Worked on Spark RDD transformations and actions to implement business logic using PySpark.
  • Worked on performance optimization using Unravel (an application performance monitoring tool).
  • Regularly checked in code to GitHub and updated the master branch via git pull.

Technology & Tools: PySpark, Hive, HDFS, Sqoop, CDH, Bitbucket

PySpark Developer

Tech Mahindra
05.2020 - 04.2021

Client: Logitech
Project: Global Data Engineering & Analytics Platform

Roles:

  • Built and optimized PySpark-based ETL pipelines to process large-scale global data.
  • Developed Hive-based data warehousing solutions for structured reporting and analytics.
  • Automated data ingestion and orchestration using Shell scripting, improving job reliability and scheduling.
  • Managed data storage and retrieval on Amazon S3, ensuring efficient partitioning and compression for cost savings.
  • Delivered high-performance data transformations (Spark SQL, HiveQL) to support business dashboards and analytics initiatives.

Technology & Tools: PyCharm, Visual Studio Code, Python, PySpark, Hive, Shell Scripting, Jupyter Notebook, Notepad++, Vim

Software Development Engineer in Test

Tech Mahindra
12.2018 - 04.2020

Client: Adobe
Project: A call center customer experience management solution (Hendrix)

Roles:

  • Designed and implemented a BDD automation framework using Cucumber, Selenium, and Java.
  • Automated UI and REST API tests; integrated with Jenkins for CI/CD execution.
  • Collaborated in Agile Scrum environment, actively participating in sprint planning and story grooming.
  • Enhanced automation coverage, reduced manual regression effort, and accelerated release cycles.

Skills: IntelliJ, Core Java, HTML, XML, CSS, Selenium WebDriver, SQL Server, Jira, Maven

Software Development Engineer in Test

Prime Focus Technologies
09.2015 - 09.2018

Project: Clear Media ERP (Star Sports)

Roles:

  • Involved in the design and development of a hybrid automation framework using Selenium and Java.
  • Wrote test scripts using Selenium WebDriver.
  • Executed test cases and test scripts and prepared consolidated test execution reports.
  • Wrote generic and project-specific reusable methods.
  • Applied extensive knowledge of Core Java, HTML, XML, JavaScript, Apache POI, and Log4j.
  • Actively involved in database testing.
  • Worked with build tools such as Maven and unit testing frameworks such as JUnit and TestNG.
  • Performed API testing using REST Assured and Postman.
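The API testing above was done with REST Assured (a Java library) and Postman; as a language-consistent sketch, the same assert-on-status-and-body pattern can be shown with only the Python standard library against a throwaway local endpoint (the `/health` route and `{"status": "ok"}` payload are made up for the example).

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen

# Throwaway stub endpoint standing in for the service under test.
class _StubHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = json.dumps({"status": "ok"}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # silence per-request logging
        pass

server = HTTPServer(("127.0.0.1", 0), _StubHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# The test itself: call the endpoint, capture status code and parsed body.
url = f"http://127.0.0.1:{server.server_port}/health"
with urlopen(url) as resp:
    status_code = resp.status
    payload = json.loads(resp.read())

server.shutdown()
print(status_code, payload)  # 200 {'status': 'ok'}
```

In REST Assured the equivalent check is a one-liner chain (`given().get(...).then().statusCode(200)`); the pattern — exercise the endpoint, assert on status and body — is the same.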

Project Specific Skills: Core Java, HTML, XML, CSS, Selenium WebDriver, JavaScript, SQL Server, Jira, Maven

Education

Bachelor of Engineering - Telecommunication Engineering

M S Ramaiah Institute of Technology
Bengaluru
07.2015

Skills

  • Domain: Media services, e-commerce, banking
  • Programming Languages: HiveQL, SQL, PySpark, Python, Shell Scripting, JIL (Autosys)
  • Operating Systems: Windows, Linux
  • Tools / DB / Packages / Frameworks: Hadoop (HDFS, Hive, and Sqoop), Spark, PySpark, Eclipse, IntelliJ, PyCharm, WinSCP, PuTTY
  • Platforms: AWS, CDP

Certification

  • Attended various trainings on the Spark framework and AWS.
  • AWS Certified Cloud Practitioner
  • Databricks Associate Developer for Apache Spark 3.0 - Python

Timeline

Specialist Data Engineering

LTIMindtree
09.2022 - Current

Senior Technical Services Specialist

IBM
04.2021 - 09.2022

PySpark Developer

Tech Mahindra
05.2020 - 04.2021

Software Development Engineer in Test

Tech Mahindra
12.2018 - 04.2020

Software Development Engineer in Test

Prime Focus Technologies
09.2015 - 09.2018

Bachelor of Engineering - Telecommunication Engineering

M S Ramaiah Institute of Technology