Job Description

  • Lead technical response during critical service incidents, ensuring swift recovery and minimal business disruption.
  • Build early-warning detection and real-time visibility using observability platforms and monitoring data.
  • Develop dashboards, alert thresholds, and recovery indicators for critical services and infrastructure.
  • Conduct structured root-cause analysis and drive permanent corrective actions to prevent recurrence.
  • Collaborate with Network, Platform, and Application teams to strengthen continuity measures and response readiness.
  • Implement automation-driven remediation steps, reducing manual resolution time and repetitive interventions.
  • Maintain and prioritise a continuity backlog focused on recurrence prevention and operational gaps.
  • Reduce false alarms, repeated disruptions, and reactive firefighting through proactive engineering practices.
  • Improve uptime, recovery time, fault prevention, and continuity metrics across S...

    Take the next step in your AI career. Submit your application to Maersk today.
