Job Description

IT Site Reliability & Performance Engineer-18411

Description

The Site Reliability and Performance Engineer designs, implements, and maintains monitoring and observability solutions and supports the performance and scalability of IT infrastructure and applications. This role partners closely with infrastructure, operations, development, and security teams to help ensure IT services and systems remain available, reliable, and high-performing.

What you will be responsible for: 

  • Develop and maintain the organization’s monitoring and observability strategy, standards, and best practices.
  • Design, deploy, and manage monitoring platforms and related tools.
  • Collect, analyze, and visualize metrics, logs, traces, and events to deliver end-to-end observability across infrastructure and applications.
  • Build and maintain dashboards, alerts, and reports for infrastructure, operations, development, and security teams.
  • Improve...