Job Description

Key Responsibilities

  • Architect for Resilience: Design systems with redundancy, fault tolerance, and graceful degradation.
  • Observability & Monitoring: Implement full-stack observability including monitoring, logging, tracing, and alerting.
  • Automation First: Build workflows to automate deployments, incident response, and routine tasks.
  • Incident Management: Enable blameless postmortems and continuous improvement.
  • Release Planning: Collaborate with DevOps and engineering teams to manage lifecycle work items and release cycles.
  • Global Collaboration: Work in a shared responsibility model with 50-60% overlap with onshore teams for effective communication.

Required Skills & Experience

  • Cloud Platforms: Azure (preferred), AWS (acceptable with upskilling plan)
  • Infrastructure as Code: Terraform, Helm, GitHub Actions
  • Containerization & Orchestration: Docker, Kubernetes, Arg...

Ready to Apply?

Take the next step in your AI career. Submit your application to Hitachi Solutions today.