Job Description

Onit, Inc. is looking for a Senior Site Reliability Engineer to join our Core Infrastructure team. This role will help ensure the reliability of a diverse set of applications across our AWS infrastructure. To be successful in this role, you will need to collaborate and pair with team members, bring strong technical skills, and have a passion for technology. The individual we seek is skilled in observability, excellent at troubleshooting, and a strong problem solver. You must be able to multitask in a fast-paced environment and be a self-starter who can work independently.

Responsibilities

  • Troubleshoot deployment failures and infrastructure issues across our full AWS infrastructure stack (EKS, RDS). This includes dev, test, and production environments.
  • Create and maintain monitors for uptime and performance using Datadog, CloudWatch, and other monitoring tools.
  • Find ways to help reduce errors in systems and re...

