Site Reliability Engineer

Location: Riyadh, Saudi Arabia
Posted: 01 Jul 2026

Financial

Estimate: $42k - $60k*
Zero income tax location

Accessibility

Fully Remote
Apply from abroad
Visa Provided

Requirements

Experience: Intermediate
Arabic: Professional

Position

About
At the company, our platform processes massive volumes of real-time customer data. Any downtime, latency, or instability directly impacts our customers’ ability to make decisions and serve their own users. This role exists to make sure that doesn’t happen.

As a Site Reliability Engineer, you’ll sit at the heart of our platform’s stability, owning the reliability of our cloud infrastructure and ensuring it scales seamlessly as we grow. You won’t just react to issues; you’ll anticipate them, design systems that prevent them, and build automation that removes them entirely. If you enjoy solving complex infrastructure challenges, eliminating inefficiencies, and building systems that “just work”, this is where you’ll thrive.

What You’ll Do
You’ll be responsible for outcomes, not just tasks. Here’s what success looks like in this role:

You’ll make reliability the default
- You’ll design and maintain infrastructure that is highly available, fault-tolerant, and scalable
- You’ll proactively identify and eliminate single points of failure before they become incidents
- You’ll ensure our production systems remain stable, even under increasing scale and load

You’ll own and optimize our cloud environments
- You’ll manage and continuously improve workloads across AWS, GCP, or Azure
- You’ll use Infrastructure as Code (Terraform) to standardize and scale infrastructure
- You’ll optimize resource usage to balance performance and cost

You’ll run and improve Kubernetes in production
- You’ll operate and scale Kubernetes clusters (EKS, GKE, etc.) with confidence
- You’ll troubleshoot issues quickly and ensure smooth deployments and upgrades
- You’ll ensure our containerized workloads perform reliably at scale

You’ll build strong observability and respond to incidents
- You’ll implement and refine monitoring systems using tools like Prometheus, Grafana, Datadog, or ELK
- You’ll define alerting that is meaningful, not noisy
- You’ll respond to incidents, lead root cause analysis, and ensure we learn from every failure

You’ll automate everything that shouldn’t be manual
- You’ll write scripts and build tooling to eliminate repetitive operational work
- You’ll continuously improve infrastructure efficiency through automation
- You’ll promote a culture where manual work is a temporary state, not the norm

You’ll collaborate to improve the entire system
- You’ll work closely with DevOps and engineering teams to solve performance bottlenecks
- You’ll contribute to CI/CD improvements and deployment reliability
- You’ll help shape reliability best practices across the organization

Requirements

- You’ve spent ~3 years working in SRE, DevOps, or infrastructure engineering, and you’ve seen what breaks at scale
- You’re comfortable working in cloud environments like AWS, GCP, or Azure, and you understand how distributed systems behave
- You’ve worked hands-on with Kubernetes in production and know how to troubleshoot it when things go wrong
- You don’t just fix issues; you ask why they happened and make sure they don’t happen again

Technically, you likely: