Job Description
ROLE & RESPONSIBILITIES:
Own and improve operational reliability and availability of critical applications across their lifecycle.
Design, implement and maintain automation for infrastructure provisioning and deployments (IaC).
Build and optimize CI/CD pipelines to enable fast, reliable and repeatable releases.
Develop monitoring, alerting and observability solutions to detect and prevent incidents.
Lead incident response for escalated production issues and drive root cause analysis and remediation.
Implement and enforce operational best practices, runbooks and playbooks for the team.
Collaborate closely with development teams to improve observability, testability and deployability of applications.
Drive performance tuning, capacity planning and availability engineering activities.
Plan and execute upgrades, migrations and infrastructure improvements with minimal downtime.
Ensure security, compliance and cer...
Ready to Apply?
Take the next step in your AI career. Submit your application to Imizizi today.