Senior Site Reliability Engineer

NVIDIA

📍 Bengaluru, Karnataka, India

Full time · Engineers · Posted February 27, 2026

Job Description

Site Reliability Engineering (SRE) is an engineering discipline for designing, building, and maintaining large-scale production systems with high efficiency and availability, using a combination of software and systems engineering practices. It is a highly specialized domain that demands knowledge across systems, networking, coding, databases, capacity management, continuous delivery and deployment, and open-source cloud-enabling technologies such as Kubernetes and OpenStack. The SRE team at NVIDIA ensures that our internal- and external-facing GPU cloud services deliver the reliability and uptime promised to users, while enabling developers to make changes to the existing system through careful preparation and planning, with an eye on capacity, latency, and performance. SRE is also a mindset and a set of engineering approaches to running better production systems and optimizations. Much of our software development focuses on eliminating manual work through automa...
                
