Job Description

The ideal candidate will drive automation, improve observability, and strengthen operational resilience across production environments.

Responsibilities:

  • Design and operate scalable, highly available systems on AWS.
  • Manage Kubernetes clusters and containerized workloads.
  • Build and maintain CI/CD pipelines and deployment automation.
  • Implement Infrastructure as Code (Terraform, CloudFormation, etc.).
  • Enhance monitoring, logging, and distributed tracing frameworks.
  • Define and manage SLIs/SLOs to improve system reliability.
  • Lead incident triage, root cause analysis, and post-incident reviews.
  • Collaborate with engineering and infrastructure teams to optimize system performance and security.

Requirements:

  • 4–7+ years of experience in SRE, DevOps, or Cloud Engineering roles.
  • Strong AWS and cloud-native architecture experience.
  • Hands‑on expertise with Kuberne...

Ready to Apply?

Take the next step in your AI career. Submit your application to Systems Limited today.
