Senior Site Reliability Engineer

AI71 Abu Dhabi, United Arab Emirates Posted: 20 Mar 2025

Financial

  • Estimate: $80k - $120k*
  • Zero income tax location

Accessibility

  • Office Only
  • Apply from abroad
  • Visa Provided

Requirements

  • Experience: Senior
  • English: Professional

Position

AI71 is an applied research team dedicated to creating helpful and responsible AI agents for knowledge workers. Working closely with our industry partners, our cross-functional teams of AI experts build products grounded in the cutting-edge research of our colleagues from the Technology Innovation Institute (TII).

We are seeking a highly motivated and skilled DevOps/Site Reliability Engineer (SRE) to join our team. The ideal candidate will have a passion for building, deploying, and maintaining scalable, reliable systems and infrastructure. You will work closely with development teams, ensuring smooth deployment pipelines, system stability, and operational efficiency.

Location: Abu Dhabi, Abu Dhabi Emirate, United Arab Emirates
Work Conditions: On-site, Full-time

Key Responsibilities:

  • Infrastructure Automation & Management

    • Design, implement, and maintain CI/CD pipelines to streamline development workflows.
    • Design and implement scalable infrastructure for AI model deployment and management.
    • Automate infrastructure provisioning and management using tools like Terraform, Ansible, or CloudFormation.
    • Optimize cloud-based and on-premises resources to improve system scalability and cost efficiency.
    • Manage and optimize queuing systems and real-time streaming architectures.
  • System Reliability & Monitoring

    • Monitor and troubleshoot production systems to maintain uptime and performance.
    • Implement robust logging and alerting solutions using tools like Prometheus, Grafana, ELK stack, or similar.
    • Conduct root cause analyses and post-mortem reviews to improve system reliability.
  • Collaboration & Support

    • Work with development and QA teams to integrate new features into production environments seamlessly.
    • Advocate for best practices in system architecture, security, and performance optimization.
    • Provide on-call support for critical production systems as part of a rotation schedule.
  • Security & Compliance

    • Ensure infrastructure meets security and compliance requirements (e.g., SOC 2, ISO 27001).
    • Manage secrets and credentials securely using tools like Vault or AWS Secrets Manager.

Required Qualifications:

  • Bachelor’s degree in Computer Science, Engineering, or a related field (or equivalent experience).
  • Strong proficiency in at least one scripting language (e.g., Python, Bash, or Go).
  • Hands-on experience with cloud platforms like AWS, Azure, or Google Cloud.
  • Proficiency with containerization and orchestration tools (Docker, Kubernetes).
  • Experience with CI/CD tools such as Azure DevOps, Jenkins, GitLab CI/CD, or CircleCI.
  • Knowledge of monitoring and observability tools (e.g., Prometheus, Grafana, Datadog, New Relic, or PagerDuty).
  • Understanding of networking concepts (DNS, load balancing, firewalls).
  • Understanding of streaming architectures for real-time AI applications.

Preferred Qualifications:

  • Experience with Infrastructure as Code (IaC) tools like Terraform or Pulumi.
  • Knowledge of service mesh technologies (e.g., Istio, Linkerd).
  • Familiarity with database administration and scaling (vector databases, SQL, and NoSQL).
  • Previous experience in a similar role in a high-traffic production environment.

Why Join Us?

  • Opportunity to work on cutting-edge technology and challenging problems.
  • Collaborative work environment that values innovation and growth.
  • Competitive salary, benefits, and learning opportunities.

About AI71

AI71 is an AI company launched by Abu Dhabi's Advanced Technology Research Council (ATRC) and VentureOne. Building on the Falcon AI models from the Technology Innovation Institute, AI71 works across multiple domains, initially targeting the medical, education, and legal sectors. With a commitment to decentralizing data ownership, AI71 aims to set new standards in privacy and security, offering enterprises and governments complete control over their data. Through strategic partnerships, AI71 aims to broaden access to AI, support the UAE's knowledge economy, and position the nation as a leading contender on the global AI stage.