VLM Engineer

Dubai, United Arab Emirates · Posted: 11 Nov 2025

Financial

Estimate: $80k - $120k*
Zero income tax location

Accessibility

Office Only
Apply from abroad
Visa Provided

Requirements

Experience: Unspecified
English: Professional

Position

The company is a publicly funded research institute based in Abu Dhabi, United Arab Emirates. It fosters a diverse community of leading scientists, engineers, mathematicians, and researchers from around the globe, focusing on transforming problems into pioneering research and technology prototypes that advance societal progress.

As part of the company's Artificial Intelligence Research Center, the Extreme-Scale Language Model team is dedicated to developing and implementing cutting-edge deep learning technologies applicable across various domains, including Natural Language Processing, Perception, and Vision. The team is known for its work on the Falcon models and aims to continue advancing applied research in large language models.

Key Responsibilities:

Vision Model Ablation Studies: Conduct comprehensive studies on vision models, assessing the impact of various components and configurations alongside researchers.
Data Ablation Research: Collaborate with team members to perform studies that identify optimal data types and structures for training vision-language models, analyzing the effect of different data inputs on model performance.
Model Evaluation: Develop and implement robust evaluation protocols for vision-language models, assessing their performance across diverse benchmarks and real-world scenarios.
Model Training and Optimization: Engage in model training, focusing on integrating LLMs with vision models like CLIP.

Technical Skills Required:

Expertise in machine learning, particularly in vision-language models and LLMs.
Strong understanding of model architectures like CLIP and their applications in vision-language tasks.
Proficiency in distributed training techniques and multi-GPU optimization.
Experience with deep learning frameworks (e.g., PyTorch).
Strong analytical skills for conducting ablation studies and evaluating model performance.
Familiarity with dataset curation and processing for vision and language tasks.

Qualifications:

PhD in deep learning.
Proven track record of research and development in vision-language models.
A publication record in top-tier conferences is highly desirable.

At the company, we are committed to helping society overcome its biggest challenges through rigorous scientific discovery and collaboration with leading international institutions. Our approach aims to forge new and disruptive breakthroughs in various fields including AI, advanced materials, autonomous robotics, cryptography, digital security, directed energy, quantum computing, and secure systems.
