Job Description
**Overview**
Microsoft Azure’s Artificial Intelligence and High Performance Computing (AI/HPC) organization powers some of the world’s largest cloud native supercomputers used for frontier AI training, scientific computing, and large scale distributed simulations. Our team builds and operates hyperscale GPU clusters that consistently place Azure among global leaders in the Top500, MLPerf, and Graph500 benchmarks. By joining us, you step into the engineering core responsible for ensuring these systems remain reliable, performant, and ready for the next wave of AI innovation.
At this scale, interconnect fabrics are a first order reliability system that directly determines GPU availability, training throughput, and customer SLAs. As a Principal Supercomputing Operations Engineering Manager, you own the operational strategy and organizational execution for interconnect fabric reliability across flagship AI supercomputing environments. You lead teams th...
Microsoft Azure’s Artificial Intelligence and High Performance Computing (AI/HPC) organization powers some of the world’s largest cloud native supercomputers used for frontier AI training, scientific computing, and large scale distributed simulations. Our team builds and operates hyperscale GPU clusters that consistently place Azure among global leaders in the Top500, MLPerf, and Graph500 benchmarks. By joining us, you step into the engineering core responsible for ensuring these systems remain reliable, performant, and ready for the next wave of AI innovation.
At this scale, interconnect fabrics are a first order reliability system that directly determines GPU availability, training throughput, and customer SLAs. As a Principal Supercomputing Operations Engineering Manager, you own the operational strategy and organizational execution for interconnect fabric reliability across flagship AI supercomputing environments. You lead teams th...
Ready to Apply?
Take the next step in your AI career. Submit your application to Microsoft Corporation today.
Submit Application