VLM Engineer

Abu Dhabi, United Arab Emirates. Posted: 10 Oct 2025

Financial

  • Estimate: $80k - $120k*
  • Zero income tax location

Accessibility

  • Apply from abroad
  • Visa Provided

Requirements

  • Experience: Unspecified
  • English: Professional

Position

Technology Innovation Institute (TII) is a publicly funded research institute based in Abu Dhabi, United Arab Emirates. It is home to a diverse community of leading scientists, engineers, mathematicians, and researchers from across the globe, transforming problems and roadblocks into pioneering research and technology prototypes that help move society ahead. As part of TII’s Artificial Intelligence Research Center, the Extreme-Scale Language Model team is developing and implementing innovative deep learning technologies with applications in Natural Language Processing, Perception, and Vision.

Key Responsibilities

  • Conduct comprehensive ablation studies on vision models to assess the impact of various components and configurations, collaborating with researchers to analyze and report on model architectures and settings.
  • Partner with team members to perform data ablation studies, identifying optimal data types and structures for training vision-language models and analyzing the impact of different data inputs on performance.
  • Develop and implement robust evaluation protocols for vision-language models and assess model performance across diverse benchmarks and real-world scenarios.
  • Engage in model training, focusing on integrating large language models (LLMs) with vision models like CLIP.

Technical Skills Required

  • Expertise in machine learning, particularly in vision-language models and LLMs.
  • Strong understanding of model architectures like CLIP and their application in vision-language tasks.
  • Proficiency in distributed training techniques and multi-GPU optimization.
  • Experience with deep learning frameworks (e.g., PyTorch).
  • Strong analytical skills for conducting ablation studies and evaluating model performance.
  • Familiarity with dataset curation and processing for vision and language tasks.

Qualifications

  • PhD in deep learning.
  • Proven track record of research and development in vision-language models.
  • Publication record in top-tier conferences is highly desirable.