VLM Engineer

Dubai, United Arab Emirates | Posted: 30 Sep 2025

Financial

  • Estimate: $90k - $130k*
  • Zero income tax location

Accessibility

  • Office Only
  • Apply from abroad
  • Visa Provided

Requirements

  • Experience: Senior
  • English: Professional

Position

About the Job:
Technology Innovation Institute (TII) is a publicly funded research institute based in Abu Dhabi, United Arab Emirates. We are home to a diverse community of leading scientists, engineers, mathematicians, and researchers from across the globe, working to transform problems into pioneering research and technology prototypes that advance society.

As part of TII’s Artificial Intelligence Research Center, the Extreme-Scale Language Model team is developing and implementing innovative deep learning technologies with applications ranging from Natural Language Processing to Perception and Vision. Our team has developed the Falcon models and is focused on advancing cutting-edge applied research in large language models.

Key Responsibilities:

  • Vision Model Ablation Studies: Conduct comprehensive studies to assess the impact of various components and configurations on vision models. Collaborate with researchers to analyze and report on different model architectures and settings.
  • Data Ablation Research: Work with team members to identify the optimal data types and structures for training vision-language models, and analyze how different data inputs affect model performance, with an emphasis on vision-language alignment.
  • Model Evaluation: Develop and implement evaluation protocols for vision-language models, assessing performance across diverse benchmarks and real-world scenarios.
  • Model Training and Optimization: Train models, with a focus on integrating large language models (LLMs) with vision models such as CLIP.

Technical Skills Required:

  • Expertise in machine learning, specifically in vision-language models and LLMs.
  • Strong understanding of model architectures, particularly CLIP, and their application in vision-language tasks.
  • Proficiency in distributed training techniques and multi-GPU optimization.
  • Experience with deep learning frameworks (e.g., PyTorch).
  • Strong analytical skills for conducting ablation studies and model performance evaluation.
  • Familiarity with dataset curation and processing for vision and language tasks.

Qualifications:

  • PhD in deep learning.
  • Proven track record of research and development in vision-language models.
  • A publication record in top-tier conferences is highly desirable.

At TII, we are dedicated to addressing society's biggest challenges through rigorous scientific discovery and inquiry, utilizing state-of-the-art facilities and collaborating with leading international institutions.
