Senior AI Research Engineer, Model Inference

Dubai, United Arab Emirates · Posted: 23 Sep 2025

Financial

Estimate: $110k - $150k*
Zero income tax location

Accessibility

Fully Remote
Apply from abroad
Visa Provided

Requirements

Experience: Senior
English: Professional

Position

About the Job:
Join Tether and help shape the future of digital finance. As a Senior AI Research Engineer, you will play a crucial role in advancing our AI capabilities. Your primary responsibility will be to extend the inference framework for language models with a strong emphasis on mobile and integrated GPU acceleration (Vulkan). This position requires hands-on experience with quantization techniques, LoRA architectures, and mobile GPU debugging, positioning you at the forefront of next-generation small and large language model (SLM/LLM) inference and fine-tuning performance.

Responsibilities:

Implement and optimize custom inference and fine-tuning kernels for language models across various hardware backends.
Design and extend datatype and precision support for various numerical formats.
Customize and optimize Vulkan compute shaders for quantized operators and fine-tuning workflows.
Investigate and resolve GPU acceleration issues on Vulkan and integrated/mobile GPUs.
Architect support for advanced quantization techniques to improve efficiency and reduce memory usage.
Conduct GPU testing across desktop and mobile devices and collaborate with research and engineering teams.

Requirements:

Proficiency in C++ and GPU kernel programming.
Expertise in GPU acceleration using the Vulkan API.
Strong background in quantization and mixed-precision model optimization.
Familiarity with large language model architectures (e.g., Qwen, Gemma, LLaMA).
Experience with mobile GPU acceleration and model inference.
Excellent English communication skills.

Work Conditions:

Full-time remote position.

Join us in pioneering solutions that empower businesses globally through innovative technology and a commitment to transparency. If you're ready to contribute to a leading fintech platform, apply today!
