Mohammad Raisul Islam

Site Reliability Engineer
Mirpur

Summary

Professional engineer with a strong foundation in system reliability and optimization. Known for delivering robust solutions that improve system performance and reduce downtime. Collaborative team player focused on achieving results and adapting to changing environments. Skilled in automation, incident management, and continuous improvement.

Overview

6 years of professional experience
18 years of post-secondary education

Work History

Site Reliability Engineer, Team Lead

IQVIA
07.2023 - Current
  • Created a new ECR image release management framework for smooth deployments across environments such as dev, QA, pre-prod, and prod
  • Developed, maintained and re-architected a blackbox monitoring system using Python Robot Framework, enabling proactive detection of issues and reducing customer complaints
  • Developed custom scripts as needed to automate routine tasks, increasing overall team productivity and efficiency.
  • Collaborated with cross-functional teams to develop, test, and deploy scalable software solutions.
  • Improved incident management workflows by creating comprehensive documentation on troubleshooting procedures and common issues resolution steps.
  • Developed and implemented a notification service using the Strategy Pattern to support multiple notification channels including Slack and BigPanda
  • Managed on-call rotations to provide 24/7 support for critical systems when necessary
  • Conducted root-cause analyses after major incidents to identify areas for process improvement or technical enhancement opportunities.
  • Implemented cost-saving measures by optimizing resource utilization across cloud-based infrastructure environments.
  • Contributed to the ongoing refinement of internal processes and procedures within the site reliability engineering discipline through regular reviews, updates, and knowledge sharing activities.
  • Ensured compliance with relevant data privacy regulations and standards by actively participating in audits and assessments.
  • Managed capacity planning efforts to ensure optimal resource allocation based on current demand projections and future growth expectations.
  • Configured AWS VPC Flow Logs to capture detailed information about IP traffic going to and from network interfaces in the VPC
  • Wrote and maintained detailed documentation for systems, processes, and incident responses
  • Applied chaos engineering principles to test system resilience and improve fault tolerance
  • Implemented comprehensive monitoring solutions using Prometheus, Grafana, and Datadog to ensure system uptime and performance
  • Fostered collaboration between development and operations teams through effective communication strategies during project lifecycles.
  • Led cross-functional teams in high-stakes projects, ensuring timely delivery and exceeding quality expectations.
  • Validated infrastructure and applications with test cases for new releases from the DevOps team
  • Evaluated new technologies and tools to enhance overall system performance, stability, and security.

Site Reliability Engineer

MyAlice
01.2021 - 06.2023
  • Architected and implemented a high-availability Kubernetes cluster, scaling from single-node to multi-node architecture, resulting in 99.9% uptime
  • Designed modular CI/CD pipelines using GitHub Actions with parallel execution, reducing deployment time to 5 minutes and enabling zero-downtime deployments
  • Engineered a comprehensive AWS infrastructure implementing best practices for networking (VPC, Route53), compute (EC2, EKS), storage (S3, EBS), and security (IAM, WAF)
  • Implemented AWS ALB Ingress Controller with automated SSL/TLS certificate management, achieving a 60% reduction in manual certificate operations and standardizing internal service discovery
  • Containerized legacy applications using multi-stage Docker builds, reducing image sizes by 70%, and added security scanning to the CI pipeline with Aqua Security Trivy
  • Architected Elasticsearch implementation for read-heavy operations, resulting in a 95% reduction in API latency and a 40% decrease in database load
  • Automated code quality checks using pre-commit hooks, reducing code review cycles by 50% and maintaining 90% test coverage
  • Successfully led zero-downtime migration from AWS to GCP, involving 20+ microservices and 500GB+ of data, completing within planned maintenance window
  • Redesigned infrastructure on GCP using GKE and Cloud SQL, achieving 30% cost optimization
  • Configured RabbitMQ clustering for high availability and zero data loss
  • Implemented site-to-site VPNs for enterprise clients across different projects

Software Engineer, Backend

MyAlice
06.2019 - 12.2020
  • Designed and implemented a wrapper around Elasticsearch to optimize a read-heavy ticketing service, significantly reducing latency from 5 seconds to 300 milliseconds
  • Migrated read-heavy operations from PostgreSQL to Elasticsearch to leverage its powerful full-text search and near real-time querying capabilities
  • Created data synchronization processes to ensure that Elasticsearch indexes were kept up-to-date with the latest data from PostgreSQL
  • Developed scalable and maintainable code, ensuring long-term stability of the software.
  • Integrated new technologies into existing systems, increasing capabilities and improving overall performance.
  • Designed, developed, and deployed chatbots for Prime Bank and Mutual Trust Bank that let users check their account balance and statement, recharge their mobile number, and transfer funds to their own and other bank accounts

Education

Bachelor of Science and Engineering - Computer Science

Comilla University
03.2013 - 07.2019

Higher Secondary Certificate

Saint Joseph Higher Secondary School
06.2010 - 04.2012

Secondary School Certificate

M.D.C Model Institute
01.2008 - 02.2010

Secondary School Certificate

Little Flowers Preparatory School
01.2000 - 12.2007

Skills

Python
