Site Reliability Engineer - AI Agents

Dubai, United Arab Emirates. Posted: 11 Jun 2026

Financial

  • Estimate: $80k - $120k*
  • Zero income tax location

Accessibility

  • Fully Remote
  • Apply from abroad
  • Visa Provided

Requirements

  • Experience: Senior
  • English: Professional

Position

About the Job:
The company, a leading platform in the crypto space, is seeking a Site Reliability Engineer to join its AI Infrastructure team. The team builds, operates, and scales the systems that power AI agents in production, ensuring that agentic workflows are reliable, observable, and scalable. In this role, you will collaborate with Data Engineering, ML, and product-facing teams to improve the agent infrastructure and meet the high standards users expect.

Key Responsibilities:

  • Design, build, and maintain the infrastructure for AI agent workflows in production.
  • Ensure reliability, scalability, and observability of systems across internal and external products.
  • Develop platform services, APIs, SDKs, and self-service capabilities for engineering teams.
  • Manage compute, orchestration, and serving infrastructure for model inference.
  • Implement monitoring, alerting, and incident response for AI/ML workloads.
  • Utilize Infrastructure as Code (IaC) tools like Terraform for cloud infrastructure management.
  • Build and maintain CI/CD pipelines for reliable deployment of AI services.
  • Collaborate to transition experimental prototypes into hardened production systems.
  • Document architecture and best practices for team knowledge sharing.

Requirements:

  • 5+ years of experience as a Site Reliability Engineer, Infrastructure Engineer, or in a similar role.
  • Hands-on experience supporting ML infrastructure, model serving, or MLOps workflows.
  • Experience building developer platforms, APIs, or internal tooling.
  • Strong understanding of platform engineering principles and developer experience.
  • Proficient with Infrastructure as Code tools, especially Terraform.
  • Experience with containerization and orchestration, primarily Kubernetes and Docker.
  • Strong scripting skills (bash/shell) and proficiency in at least one programming language (Python preferred).
  • Experience with observability, monitoring, and incident response.

Nice to Haves:

  • Experience with agent-based or LLM-powered systems.
  • Familiarity with agent orchestration frameworks.
  • Background in data infrastructure and related technologies.
  • Experience with CI/CD pipelines for AI/ML workloads.
  • Knowledge of Cloudflare's cloud platform and its ecosystem.

Language Requirements:
Not explicitly stated in the job description, but professional proficiency in English is expected.

Diversity and Inclusion:
The company is an equal opportunity employer that values diverse talents, encourages applicants from all backgrounds, and respects the needs of all candidates.

Application Process:
Applications are accepted on an ongoing basis. Candidates may be asked to complete job-related skills assessments as part of the hiring process.
