Job Description
Site Reliability Engineers (SREs) are essential to PandaDoc's success, ensuring customers receive a reliable service with minimal downtime.
The SRE team achieves this by:
Owning the incident management processes and tools.
Managing the observability stack and alerting systems to enable timely investigation and mitigation.
Actively contributing to service codebases to proactively prevent incidents and resolve performance bottlenecks.

In essence, SREs are the cornerstone of production service resiliency, driving efforts in observability, incident management, capacity planning, and maintaining reliable operations.
In this role, you will:
Own and influence the incident management process end-to-end.
Maintain and evolve the on-prem observability stack.
Keep production applications running smoothly by participating in the on-call rotation.
Develop automations and tools to support platform reliability.