Platform Engineer

Dubai, United Arab Emirates · Posted: 02 Feb 2026

Financial

  • Estimate: $80k - $110k*
  • Zero income tax location

Accessibility

  • Office Only
  • No Relocation Support
  • Visa Provided

Requirements

  • Experience: Senior
  • English: Professional

Position

The Platform Engineer architects, builds, and operates high-performance AI infrastructure for advanced AI workloads, including LLMs, GenAI, Computer Vision, and MLOps. The role focuses on managing GPU clusters (NVIDIA A100/H100), deploying and maintaining Red Hat OpenShift AI (formerly Red Hat OpenShift Data Science, RHODS), and ensuring secure, scalable, and cost-efficient AI platforms across the company's Sovereign Cloud and hybrid/multi-cloud environments. The engineer will enable enterprise-grade AI adoption for over 200 government entities.

Key Responsibilities:

  • GPU & AI Platform Architecture: Design and implement GPU-based compute clusters. Define reference architectures for LLM hosting, Vector Databases, MLOps, and high-performance storage/networking.
    • Deliverables: Fully operational GPU-based AI infrastructure, GPU Cluster Uptime and Performance Utilization, Reduction in Cost per Training/Inference Workload.
  • GPU Cluster Operations: Install, configure, and optimize core components: CUDA, cuDNN, NCCL, NVIDIA Drivers, and GPU Operators. Implement GPU partitioning, scheduling, and performance tuning.
    • Deliverables: High-availability architecture for all AI workloads, complete documentation, and runbooks.
  • OpenShift AI (RHODS) Management: Deploy, configure, and maintain the Red Hat OpenShift AI (RHODS) platform for multi-tenant use.
    • Deliverables: Production-ready OpenShift AI (RHODS) platform, AI Project Onboarding Speed.
  • LLM & Model Serving: Build and manage infrastructure for hosting and serving open-source LLM frameworks and supporting RAG pipelines, LoRA adapters, and Vector Databases.
    • Deliverables: Multi-model LLM serving environment for entities, MLOps Pipeline Success Rate and Deployment Frequency.
  • MLOps & Automation: Implement Infrastructure as Code (IaC) and GitOps for the automated lifecycle management of the AI platform.
    • Deliverables: Infrastructure automation via Terraform & Ansible, Automation Coverage for AI Infrastructure.

Required Qualifications & Experience:

  • 7–12 years in Cloud Infrastructure, DevOps, ML Infrastructure, or Platform Engineering.
  • Deep hands-on expertise with GPU systems (NVIDIA A100/H100), Linux, containers, and Kubernetes.
  • Experience with OpenShift AI (RHODS) or equivalent Kubernetes GPU orchestration.
  • Familiarity with LLM hosting and supporting Vector Databases.

Essential Skills & Competencies:

  • Technical: Deep understanding of GPU compute, HPC architectures, and ML performance profiling.
  • Soft Skills: Strong troubleshooting, optimization, and performance engineering mindset. Excellent cross-functional collaboration and documentation skills.

Preferred Certifications:

  • NVIDIA Deep Learning / AI Infrastructure Certification
  • Red Hat OpenShift AI specialization
  • Kubernetes CKA/CKAD
  • Azure AI or Oracle Cloud AI certifications
  • Terraform & Ansible certifications

Work Conditions:

  • Full-time, on-site position.

About the Company

  • Industry: IT Services and IT Consulting