Job Description

Overview

Senior Site Reliability Engineer (SRE) with Kubernetes and Rancher experience. This full-time role focuses on building and maintaining highly resilient, secure systems, including in air-gapped environments.

Responsibilities

  • System Architecture & Management: Design and maintain highly reliable, multi-tenant systems using Kubernetes and related tools (RKE2), including components such as Ingress, Kong, Artifactory, and Sonar.
  • Observability & Monitoring: Implement and manage observability solutions with Prometheus, Grafana, Splunk, and Elastic to ensure deep visibility into system health and performance, including in air-gapped settings.
  • Compliance & Optimization: Ensure deployments meet stringent compliance standards and are optimized for performance and security.
  • Code Quality & Security: Perform regular code quality analysis and security assessments using Sonar to identify and mitiga...

Ready to Apply?

Take the next step in your AI career. Submit your application to Orion Innovation today.
