Site Reliability Engineer

Abu Dhabi, United Arab Emirates. Posted: 23 Jan 2026

Financial

  • Estimate: $90k - $120k*
  • Zero income tax location

Accessibility

  • Office Only
  • Visa Provided

Requirements

  • Experience: Senior
  • English: Professional

Position

The company is seeking a seasoned DevOps & Site Reliability Engineer (SRE) Lead to design, scale, and enhance our cloud infrastructure and observability ecosystem. Ideal candidates are passionate about automation, resilience, and reliability.

Key Responsibilities:

  • Architect and deploy scalable, highly available cloud infrastructure for production workloads.
  • Lead and implement SRE best practices, ensuring system reliability, performance, and scalability.
  • Oversee and optimize CI/CD pipelines (Jenkins, Argo CD, or similar) for seamless deployments.
  • Define and monitor SLOs & SLIs to ensure service reliability and uptime.
  • Design and manage observability frameworks — monitoring, logging, and alerting (Elastic Stack, Prometheus, Grafana, Dynatrace, New Relic).
  • Manage and optimize Kubernetes clusters and Helm charts for efficient orchestration and streamlined releases.
  • Implement auto-healing and proactive monitoring systems to prevent outages.
  • Drive fault injection testing & chaos engineering (Chaos Mesh, Litmus, AWS FIS) for resilience validation.
  • Collaborate with engineering and product teams to embed reliability into every phase of development.
  • Maintain clear documentation on infrastructure, incidents, and operational processes.

Requirements:

  • 8+ years of experience as a DevOps/SRE professional, including leading enterprise SRE implementations.
  • Hands-on experience with AWS, GCP, or Azure (for example, AWS services such as EC2, S3, RDS, and Lambda).
  • Proficient with IaC tools (Terraform, CloudFormation, Ansible).
  • Proven experience in CI/CD automation, monitoring, and incident response.
  • Skilled in observability tools — Elastic Stack, Grafana, Prometheus, Dynatrace, New Relic.
  • Strong expertise in Kubernetes & Helm for large-scale deployments.
  • Experience with both AWS-managed and self-managed databases (MySQL, Cassandra, etc.).
  • Proficient in scripting languages such as Python, Bash, or Go.
  • Experience designing and testing BCP/DR strategies.
  • Proactive in capacity planning to ensure scalability and resilience across cloud environments.
  • Excellent communication, documentation, and troubleshooting skills.

Information Security Responsibilities:

  • Comply with the company's Information Security & Service Management policies.
  • Maintain the confidentiality and integrity of all information assets.
  • Attend mandatory information security training.
  • Report any security incidents through official channels.

Language Requirements:
Professional-level English.

Location: Abu Dhabi Emirate, United Arab Emirates
Work Conditions: On-site, Full-time

Employer industry: IT System Custom Software Development