Job Description
AI Performance Engineer
We are looking for an AI Performance Engineer in Latin America with up-to-date knowledge of large AI models to work on deep learning performance optimization and benchmarking on modern GPU-based systems, with a strong focus on MLPerf Training and Inference workloads.
The primary models we work on include Llama 2, Llama 3, DeepSeek, and open-source GPT-style models (GPT-OSS).
This is a hands-on engineering role involving performance profiling, PyTorch optimization, large-scale distributed training, and building reproducible benchmarking environments, in close collaboration with other performance- and systems-focused engineers.
What You Will Do
- Optimize training and inference pipelines for large language models such as Llama 2, Llama 3, DeepSeek, and GPT-OSS
- Work on MLPerf Training and/or Inference benchmarks for LLM workloads
- Profile GPU workloads to identify compute, memory, and communication bottlenecks
- I...