

Results-driven technical consultant specializing in AWS, ETL pipelines, and big data processing. Proven ability to enhance data accuracy and optimize workflows, delivering tailored solutions for complex client requirements.
Developed ETL pipelines on EMR clusters using PySpark, optimizing data workflows and improving debugging efficiency.
Extracted and processed data from S3 and databases using AWS Glue and PySpark for initial data cleansing.
Utilized Glue Studio for data extraction and transformation across multiple file formats.
Partitioned data into the raw zone in S3 before applying business logic and moving it to the refined zone.
Loaded refined data into Redshift tables to support downstream dashboards and analytical processes.
Created AWS Lambda functions to trigger Glue jobs based on S3 object updates, enabling near real-time processing.
Integrated Amazon CloudWatch with EC2 instances to monitor the instances and track their log files.
Configured SNS topics to send notifications to subscribers based on defined requirements.
Analyzed client requirements to develop tailored technical solutions.
Collaborated with cross-functional teams to implement system integrations.
Provided training sessions for clients on new software features.
Managed multiple projects, delivering on schedule through prioritization and resource allocation.
Automated data transfers and integrated storage solutions such as FTP and AWS S3, working with Parquet files; enhanced client satisfaction through proactive communication and issue resolution.
Improved database performance via query optimization, indexing, and regular maintenance.
Developed and maintained enterprise data solutions using SQL and ETL processes.
Optimized SQL performance and enhanced data accuracy through indexing and cleansing techniques.
Developed efficient views and stored procedures to streamline data retrieval processes.
Conducted quality checks to minimize data errors and ensure accurate reporting.
Trained new team members, increasing onboarding speed and overall productivity.
Collected and analyzed travel data from diverse sources for price monitoring. Navigated databases, websites, and APIs to track price fluctuations.
Aggregated varied data sets to provide accurate insights. Utilized advanced tools for efficient data gathering and processing. Managed proxy servers to extract data from multiple websites.
Developed custom Python and Java tools for data extraction. Monitored these tools and analyzed the extracted data.
Completed day-to-day duties accurately and efficiently.
Contributed innovative ideas and solutions to enhance team performance and outcomes.
Worked successfully with a diverse group of coworkers to accomplish goals and address issues related to products and services.
Data engineering: ETL pipelines, data warehousing, data modeling, data lake architecture
Big data processing: Apache Spark, PySpark, AWS Glue, Amazon EMR
Cloud services: AWS (Glue, Redshift, S3, Lambda, Athena, EMR, RDS), Databricks
Databases: Amazon Redshift, Amazon RDS, DynamoDB, Snowflake
Programming languages: Python, SQL, Shell
Workflow orchestration: Apache Airflow, Step Functions, Control-M
DevOps tools: GitHub, Bitbucket, Jira
Operating systems: Linux, Windows Server