Job Description
We are now looking for a Senior System Software Engineer to work on Dynamo. NVIDIA is hiring software engineers for its GPU-accelerated deep learning software team. Academic and commercial groups around the world are using GPUs to power a revolution in AI, enabling breakthroughs in problems from image classification to speech recognition to natural language processing. We are a fast-paced team building a Generative AI inference platform that makes the design and deployment of new AI models easier and accessible to all users.
What you'll be doing:
In this role, you will develop open source software to serve inference of trained AI models running on GPUs. You will:
+ Contribute to the development of disaggregated serving for Dynamo-supported inference engines (vLLM, SGLang, TRT-LLM) and expand to support multi-modal models for embedding disaggregation.
+ Innovate in the management and transfer of large KV caches across heterogeneous memory and storage hierarchies, u...