
Malothu RamKalyanNaik

Summary

  • Over 6.8 years of IT experience in data-driven application design and development, with 5 years of relevant hands-on experience across various Azure data services.
  • Proficient in Azure technologies such as Azure Data Factory (ADF), Azure Databricks (ADB), Azure Synapse Analytics, Azure Active Directory, Azure Storage, Azure Data Lake Storage (ADLS), Azure Key Vault, Azure SQL DB, and Azure HDInsight.
  • Good hands-on experience with Azure DevOps (ADO) services such as Repos, Boards, and Build Pipelines (CI/CD), and with Ansible (YAML scripting) for resource orchestration and code deployment.
  • Hands-on experience developing data engineering frameworks and notebooks in Azure Databricks using Spark SQL, Scala, and PySpark.
  • Experience with Apache Hadoop components such as HDFS, MapReduce, and Hive.
  • Experience in the Microsoft Azure cloud with Data Factory, linked services, HDInsight clusters, Data Lake Gen2, and Databricks.
  • Good knowledge of Azure Synapse Analytics.
  • Proficient with big data ingestion tools such as Kafka, Spark Streaming, and Sqoop for streaming and batch data ingestion.
  • Worked with big data distributions such as Hortonworks (HDP 2.1) with Ambari.
  • Hands-on experience in application development using Java, RDBMS, and Linux shell scripting.
  • Hands-on experience with IDE and build tools such as Eclipse, NetBeans, and Maven.
  • Used Tableau to generate reports.

Overview

8 years of professional experience

Work History

Senior Data Engineer

PureSoftware Technologies Pvt Ltd
03.2023 - Current

Azure Data Developer

Acert IT Solutions Private Limited
01.2017 - 03.2023

Education

B.Tech - ECE

National Institute of Technology, Warangal (NITW)

Skills

  • Languages: Python, SQL, PySpark
  • Technologies: Azure, Azure Functions, Azure Data Factory, Data Flows
  • Databases: SQL, Oracle, Azure SQL
  • RAD/IDE: Visual Studio 2015/2017/2019
  • Tools: MS Office suite, Git, SSMS

Accomplishments

  • Project details
  • Project name: PACS & PAMB
  • Client: Prudential
  • Duration: March 2023 - till date (17 months)
  • Role: Senior Data Engineer
  • Description: data extraction and transformation
  • Hive queries: develop and execute Hive queries to extract and join data from various sources, consolidating the results into a landing layer for initial staging (a sketch follows this project's details).
  • Encryption layer (PII protection): implement encryption protocols to secure Personally Identifiable Information (PII) and ensure compliance with data privacy regulations.
  • Decryption and historical tracking: decrypt data as required in the next layer for processing.
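
A minimal PySpark sketch of the landing-layer consolidation and PII-encryption steps above. The database, table, and column names are illustrative assumptions, and the AES key is a placeholder that would in practice come from Azure Key Vault:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("pacs-landing").enableHiveSupport().getOrCreate()

# Extract and join source tables (Hive-style) into a landing layer for staging.
landing = spark.sql("""
    SELECT p.policy_id, p.product_code, c.customer_id, c.full_name, c.email
    FROM source_db.policy_raw p
    JOIN source_db.customer_raw c ON p.customer_id = c.customer_id
""")

# Encryption layer: protect PII columns before persisting, using Spark's
# built-in aes_encrypt (Spark 3.3+). Never hard-code the key in practice;
# this 16-byte literal is only a stand-in for a Key Vault secret.
key = "0123456789abcdef"
for col in ("full_name", "email"):
    landing = landing.withColumn(col, F.expr(f"base64(aes_encrypt({col}, '{key}'))"))

landing.write.mode("overwrite").saveAsTable("landing.policy_customer")
```

The downstream layer would reverse the transformation with aes_decrypt where decryption is authorized.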
  • Project 1: EDAP – Common Ingestion
  • Client: Chevron
  • Environment: PySpark, Python, Databricks, ADF, Azure DevOps
  • Duration: January 2019 - April 2021
  • Role: Azure Data Engineer
  • Objective: the goal of the Data Science Technical Delivery Platform ingestion process is to efficiently connect to and ingest data from both on-premises and Azure/external systems of record. The data is captured in its original format and landed in the Enterprise Azure Data Lake (ADLS Gen2). The ingestion process generates lifecycle events and captures data provenance information, which is then sent to the Data Science Technical Delivery Platform Orchestrator for further processing via event-based integration.
  • Roles and responsibilities:
  • Pipeline development:
  • Azure Data Factory (ADF): designed and implemented data pipelines to extract data from various sources and load it into Azure Synapse Analytics.
  • Data transformation: used PySpark for data transformation tasks and pushed the processed data into Azure Data Lake Storage (ADLS).
  • Infrastructure and deployment:
  • Azure DevOps: set up infrastructure, built, and deployed applications using Azure DevOps.
  • Incremental load strategy: implemented daily incremental data loads to ensure efficient and timely updates (a watermark-based sketch follows this project's details).
  • Testing and releases:
  • Release management: managed deployment releases, including unit and integration testing, to ensure quality and functionality.
  • Data Factory management:
  • Linked services and datasets: created and managed linked services, datasets, and pipelines within Azure Data Factory.
  • Stored procedures: developed and optimized stored procedures using T-SQL.
  • Activities: configured and managed copy, lookup, and metadata activities in ADF.
  • Monitoring: monitored pipelines, identified issues, and implemented fixes as necessary.
  • Data transformation and workflow:
  • Data flows: designed and implemented data flows for transforming and moving data into Azure using Azure Data Factory.
  • End-to-end framework: developed a comprehensive project framework, ensuring timely delivery and alignment with customer requirements.
  • Key achievements:
  • Efficient data integration: streamlined the ingestion process from multiple data sources into Azure, ensuring data integrity and availability.
  • Robust transformation processes: leveraged PySpark for scalable data transformations, enhancing data processing capabilities.
  • Effective deployment: successfully managed application deployment and testing, ensuring reliable and smooth operation.
  • Proactive problem-solving: addressed issues promptly and met project deadlines.
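
A hedged sketch of the watermark-based daily incremental load referenced above. The watermark table, source table, and ADLS path (etl.watermarks, src.orders, the abfss URL) are hypothetical, and etl.watermarks is assumed to be a Delta table so it can be updated in place:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("edap-incremental").getOrCreate()
adls_base = "abfss://raw@storageaccount.dfs.core.windows.net/orders"

# 1. Read the last successful watermark for this source table.
last_wm = (spark.table("etl.watermarks")
                .filter(F.col("table_name") == "orders")
                .agg(F.max("watermark_ts")).first()[0])

# 2. Pull only rows changed since the watermark.
delta = spark.table("src.orders").filter(F.col("modified_ts") > F.lit(last_wm))

if delta.head(1):
    # 3. Land the increment into ADLS Gen2, partitioned by load date.
    (delta.withColumn("load_date", F.current_date())
          .write.mode("append").partitionBy("load_date").parquet(adls_base))

    # 4. Advance the watermark only after a successful write, so a failed
    #    run is simply retried from the old watermark.
    new_wm = delta.agg(F.max("modified_ts")).first()[0]
    spark.sql(f"UPDATE etl.watermarks SET watermark_ts = '{new_wm}' "
              "WHERE table_name = 'orders'")
```

In ADF, the same pattern maps to a lookup activity (fetch watermark), a copy activity with a filtered source query, and a stored procedure or notebook step to advance the watermark.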
  • Project 2: Data Lake Data Engineering
  • Client: Communication and Media
  • Environment: Azure Data Factory, Azure Databricks, Azure SQL DB, ADLS Gen2
  • Role: Sr. Azure Data Engineer
  • The Data Lake Technology Platform is a modern technological foundation within a secure, hosted ecosystem. It integrates client data with industry-specific data feeds, leveraging Media's unique capabilities in data analytics and advanced AI, and aims to deliver enhanced opportunities throughout the customer lifecycle.
  • Roles and responsibilities:
  • ETL workflow management (Azure ADF & Databricks): developed and managed ETL workflows using Azure Data Factory (ADF) and Databricks with PySpark, extracting data from relational databases and loading it into Azure SQL Database (a sketch follows this project's details).
  • Data transformation: extensively transformed data using PySpark and pushed the processed data into Azure Data Lake Storage (ADLS) Gen2.
  • Data storage: stored the transformed data in Azure SQL Database for consumption by Power BI and Spotfire.
  • Data migration (on-prem to Azure): led data migration projects from on-premises systems to Azure using Databricks and Spark APIs, ensuring a seamless transition and data integrity.
  • Daily operations: attended daily scrum meetings and provided updates on Azure DevOps (ADO) user stories.
  • Pipeline monitoring: monitored pipeline jobs for performance and reliability, promptly addressing and fixing any issues.
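
A minimal sketch of the extract-transform-load flow described above, assuming placeholder JDBC connection details, table names, and ADLS paths; credentials would come from a secret scope, not literals:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("datalake-etl").getOrCreate()

jdbc_url = "jdbc:sqlserver://myserver.database.windows.net:1433;database=analytics"
props = {"user": "etl_user", "password": "<from-key-vault>",
         "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver"}

# Extract: read the source table from the relational database over JDBC.
subs = spark.read.jdbc(jdbc_url, "dbo.subscriptions", properties=props)

# Transform: derive reporting columns and filter inactive records.
curated = (subs.filter(F.col("status") == "active")
               .withColumn("tenure_days",
                           F.datediff(F.current_date(), F.col("start_date"))))

# Stage the curated data in ADLS Gen2 for downstream consumers.
curated.write.mode("overwrite").parquet(
    "abfss://curated@storageaccount.dfs.core.windows.net/subscriptions")

# Load: write the result to Azure SQL Database for Power BI / Spotfire.
curated.write.jdbc(jdbc_url, "dbo.subscriptions_curated",
                   mode="overwrite", properties=props)
```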

Additional Information

  • SCD Type 2: apply Slowly Changing Dimension (SCD) Type 2 techniques to track and preserve historical changes in data, maintaining a complete history (a Delta Lake merge sketch follows this list).
  • Incremental load and delta processing: create and manage a delta pipeline for handling incremental loads; set up jobs to run at hourly or daily intervals to keep the data current.
  • Data quality and governance: define and enforce data quality rules to ensure data accuracy and integrity throughout the pipeline; utilize Unity Catalog for data governance and lineage graphs to monitor data flow, transformations, and quality.
  • Best practices:
  • Documentation: keep comprehensive documentation of queries, encryption rules, and historical tracking processes.
  • Testing: implement robust testing strategies for queries and data transformations to ensure performance and accuracy.
  • Monitoring & alerts: set up real-time monitoring and alerts for pipeline jobs to quickly address any issues.
  • Version control: use version control systems to manage and track changes to queries and rules.
  • Compliance: regularly review and update encryption practices to meet evolving compliance standards.
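
A minimal SCD Type 2 sketch using a Delta Lake MERGE, in the spirit of the historical-tracking approach above. All table and column names (dim.dim_customer, staging.customer_updates, email/address as the tracked attributes) are assumptions for illustration, and the dimension is assumed to be a Delta table, which MERGE requires:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("scd2-sketch").getOrCreate()
spark.table("staging.customer_updates").createOrReplaceTempView("updates")

# Step 1: expire the current rows whose tracked attributes changed.
spark.sql("""
    MERGE INTO dim.dim_customer t
    USING updates s
    ON t.customer_id = s.customer_id AND t.is_current = true
    WHEN MATCHED AND (t.email <> s.email OR t.address <> s.address) THEN
      UPDATE SET is_current = false, end_date = current_date()
""")

# Step 2: insert a fresh "current" version for changed and brand-new
# customers. Changed customers no longer have a current row after step 1,
# so this anti-join covers both cases while skipping unchanged rows.
spark.sql("""
    INSERT INTO dim.dim_customer
    SELECT s.customer_id, s.email, s.address,
           current_date() AS start_date,
           CAST(NULL AS DATE) AS end_date,
           true AS is_current
    FROM updates s
    LEFT JOIN dim.dim_customer t
      ON t.customer_id = s.customer_id AND t.is_current = true
    WHERE t.customer_id IS NULL
""")
```

Scheduling this as an hourly or daily Databricks job gives the delta pipeline described above, with start_date/end_date preserving the full change history.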
