
AI Research Engineer (Model Serving & Inference)

Tether · Dubai, United Arab Emirates · Posted: 17 Jun 2025

Financial

  • Estimate: $90k - $120k*
  • Zero income tax location

Accessibility

  • Fully Remote
  • Apply from abroad
  • Visa Provided

Requirements

  • Experience: Unspecified
  • English: Professional

Position

Join Tether and Shape the Future of Digital Finance. At Tether, we’re not just building products; we’re pioneering a global financial revolution. Our cutting-edge solutions empower businesses—from exchanges and wallets to payment processors and ATMs—to seamlessly integrate reserve-backed tokens across blockchains. By harnessing the power of blockchain technology, Tether enables the storage, sending, and receiving of digital tokens instantly, securely, and globally, all at a fraction of the cost. Transparency is the bedrock of everything we do, ensuring trust in every transaction.

As a member of our AI model team, you will drive innovation in model serving and inference architectures for advanced AI systems. Your work will focus on optimizing model deployment and inference strategies to deliver highly responsive, efficient, and scalable performance across real-world applications. You will engage with a wide spectrum of systems, from resource-efficient models designed for limited hardware environments to complex, multi-modal architectures that integrate text, images, and audio.

Responsibilities:

  • Design and deploy state-of-the-art model serving architectures that achieve high throughput and low latency while optimizing memory usage.
  • Ensure efficient operation across diverse environments, including resource-constrained devices and edge platforms.
  • Establish performance targets such as reduced latency and improved token generation throughput.
  • Build, run, and monitor controlled inference tests in both simulated and live production environments.
  • Track key performance indicators (KPIs) including response latency, throughput, memory consumption, and error rates.
  • Prepare high-quality test datasets and simulation scenarios tailored to real-world deployment challenges.
  • Analyze computational efficiency and diagnose bottlenecks in the serving pipeline.
  • Work closely with cross-functional teams to integrate optimized serving and inference frameworks into production pipelines for edge and on-device applications.

Qualifications:

  • Degree in Computer Science or a related field; ideally, a PhD in NLP, Machine Learning, or a related field.
  • Proven experience in low-level kernel optimizations and inference optimization on mobile devices.
  • Deep understanding of modern model serving architectures and inference optimization techniques.
  • Strong expertise in writing CPU and GPU kernels for mobile devices, along with a thorough understanding of model serving frameworks and engines.
  • Practical experience in developing and deploying end-to-end inference pipelines.
  • Demonstrated ability to apply empirical research to solving model serving and inference optimization challenges.

Language Requirements:
Excellent English communication skills are required.

Location: United Arab Emirates (Remote)
Work Conditions: Full-time, Remote

Apply now


About Tether

Tether has evolved to meet global needs with agility and vision. Our flagship product, USD₮, became a lifeline during the pandemic, providing financial freedom and stability in emerging markets. We now innovate in Bitcoin mining, P2P communications, education, AI, and neurotechnology. Our sustainable energy investments and support for decentralized communication ensure privacy and efficiency. Committed to democratizing technology and education, Tether empowers people globally to thrive. Join us as we continue to dare, innovate, and dream. Unstoppable Together