Job Description

Senior AI Research Engineer, Model Inference (Remote)

Join to apply for the Senior AI Research Engineer, Model Inference (Remote) role at Tether.io

About the job

We are looking for an experienced AI Model Engineer with deep expertise in kernel development, model optimization, fine-tuning, and GPU acceleration. The engineer will extend the inference framework to support inference and fine-tuning for language models, with a strong focus on mobile and integrated GPU acceleration (Vulkan).

This role requires hands-on experience with quantization techniques, LoRA architectures, the Vulkan backend, and mobile GPU debugging. You will play a critical role in pushing the boundaries of desktop and on-device inference and fine-tuning performance for next-generation SLMs/LLMs.

Responsibilities

  • Implement and optimize custom inference and fine-tuning kernels for small an...
