Job Description

***Role Overview***
Application Operational Services is seeking a Site Reliability Engineer (SRE) to support the production stability, performance, and reliability of critical enterprise applications. This role focuses on end‑to‑end monitoring, observability, incident management, and SRE best practices, with a strong emphasis on Dynatrace‑based application performance monitoring (APM).
The SRE will partner closely with engineering and product teams to ensure systems meet defined service level objectives (SLOs) and deliver a consistent end‑user experience.
***Key Responsibilities***
Support production operations of mission‑critical applications, ensuring availability, performance, and resiliency
Perform full‑stack triage of alerts and incidents using Dynatrace, partnering with engineering teams to identify the root cause
Define, track, and improve SLIs, SLOs, and error budgets with product owners and developers
Design and maintain Dynatrace dashboards, me...

Ready to Apply?

Take the next step in your AI career. Submit your application to TEKsystems today.
