AI Research Engineer (Model Compression & Quantization)

Dubai, United Arab Emirates · Posted: 19 May 2026

Financial

Estimate: $80k - $120k*
Zero income tax location

Accessibility

Fully Remote
Apply from abroad
Visa Provided

Requirements

Experience: Unspecified
English: Professional

Position

Join the company and shape the future of digital finance. As a member of our AI research team, you will drive innovation in model compression and efficient deployment for advanced multimodal AI systems, including large language models (LLMs) and vision-language models (VLMs). Your focus will be on reducing model footprint and computational cost while preserving accuracy, so that high-performance AI can run efficiently on resource-constrained edge devices.

You will apply and advance compression techniques such as quantization, knowledge distillation, and pruning to streamline complex multimodal architectures that integrate text, images, and audio. Responsibilities include:

Applying low-bit quantization to reduce model size and inference latency for generative AI models, while maintaining accuracy and output quality.
Leveraging knowledge distillation to transfer capabilities from larger teacher models to smaller student models.
Implementing pruning techniques to remove redundant parameters and attention heads.
Analyzing trade-offs between model efficiency and accuracy; proposing improvements based on empirical findings.
Researching and applying mixed-precision quantization and other advanced compression strategies.
Staying current with the latest research in model compression, documenting methodologies, and authoring technical papers for top-tier conferences.

Qualifications:

A degree in Computer Science or a related field; ideally a PhD in NLP, Machine Learning, or a related area with a strong publication record.
Experience with PyTorch or equivalent deep learning frameworks.
Hands-on experience with model quantization, including Quantization-Aware Training (QAT) and Post-Training Quantization (PTQ).
Research and hands-on experience with knowledge distillation and model pruning.
Solid understanding of neural network architectures and training processes, including transformers.
Familiarity with C++ is a plus.

Language Requirements:

Excellent English communication skills are required.

This is an opportunity to collaborate with some of the brightest minds in the fintech space, contributing to the most innovative platform in the industry.
