Lead Site Reliability Engineer

Abu Dhabi, United Arab Emirates
Posted: 27 Apr 2026

Financial

  • Estimate: $80k - $120k*
  • Zero income tax location

Accessibility

  • Office Only
  • Apply from abroad
  • Visa Provided

Requirements

  • Experience: Senior
  • English: Professional

Position

We are seeking a seasoned DevOps & Site Reliability Engineering (SRE) Lead to design, scale, and elevate our cloud infrastructure and observability ecosystem. If you’re passionate about automation, system resilience, and building highly reliable platforms, this role is for you.

Location: Abu Dhabi Emirate, United Arab Emirates
Work Conditions: On-site, Full-time

Key Responsibilities:

  • Architect and deploy scalable, highly available cloud infrastructure
  • Lead SRE best practices to ensure reliability, performance, and scalability
  • Optimize CI/CD pipelines (Jenkins, Argo CD, or similar) for seamless deployments
  • Define and track SLOs & SLIs to maintain uptime and service health
  • Build robust observability frameworks (Elastic Stack, Prometheus, Grafana, Dynatrace, New Relic)
  • Manage Kubernetes clusters and Helm charts for efficient orchestration
  • Implement auto-healing systems and proactive monitoring
  • Drive chaos engineering and resilience testing (Chaos Mesh, Litmus, AWS FIS)
  • Collaborate with engineering and product teams to embed reliability into development
  • Maintain clear infrastructure and incident documentation

What We’re Looking For:

  • 8+ years of experience in DevOps/SRE, including leadership in enterprise environments
  • Hands-on experience with AWS, GCP, or Azure
  • Strong expertise in Infrastructure as Code (Terraform, CloudFormation, Ansible)
  • Proven experience in CI/CD, monitoring, and incident response
  • Deep knowledge of observability tools and practices
  • Strong Kubernetes and Helm experience at scale
  • Experience with databases such as MySQL and Cassandra
  • Proficiency in Python, Bash, or Go
  • Experience in BCP/DR planning and capacity management
  • Strong communication, troubleshooting, and documentation skills

Information Security:

  • Adhere to information security and service management policies
  • Ensure confidentiality and integrity of data
  • Participate in security training and report incidents as required

Company

  • Industry: IT System Custom Software Development