Job Description
This role is responsible for ensuring the continuous health, performance, and reliability of cloud-based and external-facing applications through proactive monitoring, incident response, and operational excellence.
The position involves managing alerts, performing deployments across multi‑cloud environments, handling SSL/TLS certificate lifecycles, and conducting initial troubleshooting with timely escalations based on established runbooks.
The role also contributes to problem management by analyzing incidents, identifying root causes, and implementing preventive measures, and it produces performance reports for leadership.
Success in this role requires strong scripting and automation capabilities, adherence to security and compliance standards, and effective coordination during incident bridge calls and NOC communications.
Operating in a 24/7 rotational shift environment, the role demands strong analytical skills, deep underst...
Take the next step in your AI career. Submit your application to First Advantage today.