Job Description
About the Role
At Together AI, we are building state-of-the-art infrastructure to enable efficient and scalable inference for large language models (LLMs). Our mission is to optimize inference frameworks, algorithms, and infrastructure, pushing the boundaries of performance, scalability, and cost-efficiency.
We are seeking an Inference Frameworks and Optimization Engineer to design, develop, and optimize distributed inference engines that support multimodal and language models at scale. This role will focus on low-latency, high-throughput inference, GPU/accelerator optimizations, and software-hardware co-design, ensuring efficient large-scale deployment of LLMs and vision models.
This role offers a unique opportunity to shape the future of LLM inference infrastructure, ensuring scalable, high-performance AI deployment across a diverse range of applications. If you're passionate about pushing the boundaries of AI inference, we'd love to hear from you.
Responsibilities
Ready to Apply?
Take the next step in your AI career. Submit your application to Together AI today.