Job Description

**Overview**

The **Principal SRE** leads curial initiatives in the team responsible for **durable, high‑quality handling of high severity, customer impacting, incidents across Microsoft M365 Substrate Core services** . As our systems continue to expand in scope and complexity, this role ensures incidents are handled **consistently, predictably, and with clear ownership** , minimizing customer impact while accelerating recovery and organizational learning.

This role combines **leadership** in the reliability domain, **incident command,** and **operational governance** , working in close partnership with Incident Managers (IMs), Service Owners, and executive stakeholders. You will set standards for how Substrate responds to its most severe outages and drive the evolution of incident handling and escalation practices across all production rings and public/sovereign clouds.

Microsoft’s mission is to empower every person and every organization on the plan...

Ready to Apply?

Take the next step in your AI career. Submit your application to Microsoft Corporation today.

Submit Application