MD Arif

Business Intelligence Engineer | Data Analyst | SQL | ETL | Python
Dallas, Texas

Summary


  • Data Engineering professional with solid foundational skills and a proven track record of implementation across a variety of data platforms. Self-motivated, with a strong sense of personal accountability in both individual and team settings.
  • Experience in Data Analysis, Data Profiling, Data Integration, Data Migration, Data Governance, Metadata Management, Master Data Management and Configuration Management.

Overview

4 years of professional experience
6 years of post-secondary education
1 Certification

Work History

Graduate Teaching Assistant

Texas A & M University Commerce
Commerce, Texas
01.2021 - Current
  • Checked assignments, proctored tests and provided grades according to university standards.
  • Documented attendance and completed assignments to maintain full class and student records.
  • Taught ETL and SQL college-level courses for over 50 students.
  • Oversaw classes of up to 30 students in Business Intelligence Course.
  • Prepared lessons according to course outline to convey required material and deepen student understanding of subject matter.
  • Led courses independently with minimal oversight from professors.
  • Used GitHub and Docker as the runtime environment for a CI/CD system to build, test, and deploy.
  • Consumed data from Kafka sources and implemented an analysis model.

Data Engineer

Cognizant Technology Solutions
Hyderabad, Telangana
01.2019 - 11.2020
  • Designed compliance frameworks for multi-site data warehousing efforts to verify conformity with state and federal data security guidelines.
  • Generated detailed studies on potential third-party data handling solutions, verifying compliance with internal needs and stakeholder requirements.
  • Developed, implemented and maintained data analytics protocols, standards and documentation.
  • Analyzed complex data and identified anomalies, trends and risks to provide useful insights to improve internal controls.
  • Designed and built data processing pipelines, tools, and frameworks in the Hadoop ecosystem.
  • Worked on interactive guided analytics apps and dashboards; implemented the ER/dimensional model in Tableau.
  • Constructed product-usage SDK data and Siebel data aggregations using PySpark, Scala, and Spark SQL.
  • Developed an ETL tool to load data from a given source to a target using Python, PySpark, Sqoop, Unix, and Hive.
  • Participated in requirements gathering and worked closely with the architect and SMEs on design and modeling.
  • Handled data ingestion from various data sources, performed transformations using Spark, and loaded data into HDFS.
  • Used Hive context with partitioned Hive external tables maintained in an AWS S3 location for reporting, data science dashboarding, and ad-hoc analyses.
  • Worked on API development for client apps to query for the current version.
  • Translated a set of requirements and data into a usable database schema by creating or recreating ad hoc queries, scripts, and macros, updating existing queries, and creating new ones to consolidate data into a master file.
  • Worked with distributed computing using Hadoop and applied various machine learning techniques to solve data-related challenges.
  • Worked on application changes for cloud readiness and implemented Liquibase application changes.

Data Analyst

Minevesta Infotech
Hyderabad, Telangana
03.2018 - 01.2019
  • Identified and documented detailed business rules and use cases based on requirements analysis.
  • Researched and resolved issues regarding integrity of data flow into databases.
  • Identified, analyzed and interpreted trends or patterns in complex data sets.
  • Analyzed transactions to build logical business intelligence model for real-time reporting needs.
  • Built data pipelines using Hive and Apache Spark to calculate core marketing metrics.
  • Sourced data from multiple places into the Hadoop cluster.
  • Created Power BI reports for different metrics, using Hive and BigQuery as the source data.
  • Collected, cleaned, transformed, and loaded user clickstream data and made it available for downstream pipelines and analyses.
  • Created Dataflow pipelines using Spark-Scala.
  • Migrated existing Teradata and Hive queries to Google Cloud Platform.
  • Created aggregate tables in the data pipeline for the reporting team to present the metrics to business users.
  • Scheduled the data pipelines in a scheduler such as UC4 to run daily or weekly, as required.
  • Migrated from a traditional Hadoop cluster to Google Cloud Platform.
  • Ingested data into the data lake from different sources and performed transformations such as sort, join, aggregate, and filter to process various datasets.
  • Automated data flow between the software systems using Apache Airflow.
  • Created ETL jobs using Spark to perform data migrations and data loads into HDFS, Hive from different source systems.
  • Implemented Spark jobs for data preprocessing, validation, normalization, and transmission.
  • Configured multiple Spark jobs to achieve efficient run times.

Education

Master of Science - Computer Science And Programming

Texas A & M University
Texas
07.2020 - 07.2022

Bachelor of Engineering - Computer Science

Deccan College of Engineering And Technology
India
06.2009 - 06.2013

Skills

Python, SQL, PL/SQL, Scala

Certification

Careerara: Data Science Professional Certificate

Interests

Reading Blogs

Writing Blogs

Travelling

Timeline

Careerara: Data Science Professional Certificate

07.2022

Graduate Teaching Assistant

Texas A & M University Commerce
01.2021 - Current

Master of Science - Computer Science And Programming

Texas A & M University
07.2020 - 07.2022

Data Engineer

Cognizant Technology Solutions
01.2019 - 11.2020

Data Analyst

Minevesta Infotech
03.2018 - 01.2019

Bachelor of Engineering - Computer Science

Deccan College of Engineering And Technology
06.2009 - 06.2013