Job Description
NVIDIA is hiring experienced Site Reliability Engineers (SREs) to help scale up its AI infrastructure. We expect you to have significant experience with site reliability principles and techniques, including reliability assessments, incident management processes, production system observability, monitoring and alerting, automated deployments, and toil elimination. We view SRE as a software engineering discipline and expect significant contributions to our codebase. We welcome out-of-the-box thinkers who bring new ideas and a strong bias for execution. Expect to be constantly challenged, improving, and evolving for the better. You will help advance NVIDIA's capacity to build and deploy leading infrastructure solutions for a broad range of AI-based applications. If you're creative, passionate about SRE, and love having fun, please apply today!
For two decades, we have pioneered visual computing, the art and science of computer graphics. With the invention of the GPU - the engi...