Job Description

Come work at a place where innovation and teamwork come together to support the most exciting missions in the world!

As a Senior Site Reliability Engineer (SRE), you will be responsible for the reliability, scalability, and observability of our DevOps ecosystem. This includes CI/CD systems, Kubernetes clusters, infrastructure automation, and telemetry platforms. You will work closely with development, QA, and operations teams to build resilient systems and ensure continuous improvement of reliability standards.

Key Responsibilities:

  • Own and manage DevOps components and tooling across 100+ production environments.
  • Administer, scale, and optimize Kubernetes clusters used for application and infrastructure workloads.
  • Implement and maintain observability stacks including Prometheus, OpenTelemetry (OTel), Elasticsearch, and ClickHouse for metrics, tracing, and log analytics.
  • Ensure high avail...

Ready to Apply?

Take the next step in your AI career. Submit your application to Qualys today.