MD Arif

Business Intelligence Engineer | Data Analyst | SQL | ETL | Python

Dallas, Texas

Summary

Data engineering professional with solid foundational skills and a proven track record of implementation across a variety of data platforms. Self-motivated, with strong personal accountability in both individual and team settings.
Experienced in data analysis, data profiling, data integration, data migration, data governance, metadata management, master data management and configuration management.

Work History

Graduate Teaching Assistant

Texas A&M University-Commerce

Commerce, Texas

01.2021 - Current

Checked assignments, proctored tests and provided grades according to university standards.
Documented attendance and completed assignments to maintain full class and student records.
Taught college-level ETL and SQL courses to over 50 students.
Oversaw classes of up to 30 students in the Business Intelligence course.
Prepared lessons according to course outline to convey required material and deepen student understanding of subject matter.
Led courses independently with minimal oversight from professors.
Used GitHub and Docker as the runtime environment for the CI/CD system to build, test, and deploy.
Consumed data from Kafka sources and implemented an analysis model.

Data Engineer

Cognizant Technology Solutions

Hyderabad, Telangana

01.2019 - 11.2020

Designed compliance frameworks for multi-site data warehousing efforts to verify conformity with state and federal data security guidelines.
Generated detailed studies on potential third-party data handling solutions, verifying compliance with internal needs and stakeholder requirements.
Developed, implemented and maintained data analytics protocols, standards and documentation.
Analyzed complex data and identified anomalies, trends and risks to provide useful insights to improve internal controls.
Designed and built data processing pipelines, tools and frameworks in the Hadoop ecosystem.
Worked on interactive guided analytics apps and dashboards; implemented the ER/dimensional model in Tableau.
Constructed product-usage SDK data and Siebel data aggregations using PySpark, Scala and Spark SQL.
Developed an ETL tool to load data from a given source to a target using Python, PySpark, Sqoop, Unix and Hive.
Participated in requirements gathering and worked closely with the architect and SMEs on design and modeling.
Handled data ingestion from various data sources, performed transformations using Spark, and loaded data into HDFS.
Used Hive context with partitioned Hive external tables maintained in an AWS S3 location for reporting, data science dashboarding and ad hoc analyses.
Worked on API development for client apps to query for the current version.
Translated requirements and data into a usable database schema by creating or recreating ad hoc queries, scripts and macros, updating existing queries and writing new ones to consolidate data into a master file.
Worked with distributed computing using Hadoop and applied machine learning techniques to solve data-related challenges.
Worked on application changes for cloud readiness and implemented Liquibase application changes.

Data Analyst

Minevesta Infotech

Hyderabad, Telangana

03.2018 - 01.2019

Identified and documented detailed business rules and use cases based on requirements analysis.
Researched and resolved issues regarding integrity of data flow into databases.
Identified, analyzed and interpreted trends or patterns in complex data sets.
Analyzed transactions to build logical business intelligence model for real-time reporting needs.
Built data pipelines using Hive and Apache Spark to calculate core marketing metrics.
Sourced data from multiple places into the Hadoop cluster.
Created Power BI reports for different metrics using Hive and BigQuery as the source data.
Collected, cleaned, transformed and loaded user clickstream data and made it available for downstream pipelines and analyses.
Created Dataflow pipelines using Spark-Scala.
Migrated existing Teradata and Hive queries to Google Cloud Platform.
Created aggregate tables in the data pipeline for the reporting team to present metrics to business users.
Scheduled the data pipelines in a scheduler such as UC4 to run daily or weekly as required.
Migrated from a traditional Hadoop cluster to Google Cloud Platform.
Ingested data into the data lake from different sources and performed transformations such as sort, join, aggregation and filter to process various datasets.
Automated data flow between the software systems using Apache Airflow.
Created ETL jobs using Spark to perform data migrations and data loads into HDFS and Hive from different source systems.
Implemented Spark jobs for data preprocessing, validation, normalization, and transmission.
Configured multiple Spark jobs for efficient run times.

Education

Master of Science - Computer Science And Programming

Texas A&M University

Texas

07.2020 - 07.2022

Bachelor of Engineering - Computer Science

Deccan College of Engineering And Technology

India

06.2009 - 06.2013

Skills

Python, SQL, PL/SQL, Scala

MongoDB, Amazon DynamoDB, HBase

Hadoop, HDFS, Hive, Spark, PySpark, Sqoop, Kafka

Oracle, DB2, Teradata, SQL Server

AWS Glue, Azure Data Factory, GCP, Airflow, Spark, Sqoop, Flume, Apache Kafka, Spark Streaming

Jira, Rally

BitBucket, Git, GitHub

AWS EC2, S3, Lambda, EMR, GCP BigQuery

Certification

Careerara: Data Science Professional Certificate

Interests

Reading Blogs

Writing Blogs

Travelling

Timeline

Careerara: Data Science Professional Certificate

07.2022

Graduate Teaching Assistant

Texas A&M University-Commerce

01.2021 - Current

Master of Science - Computer Science And Programming

Texas A&M University

07.2020 - 07.2022

Data Engineer

Cognizant Technology Solutions

01.2019 - 11.2020

Data Analyst

Minevesta Infotech

03.2018 - 01.2019

Bachelor of Engineering - Computer Science

Deccan College of Engineering And Technology

06.2009 - 06.2013
