Job Description
Key Responsibilities
- Architect for Resilience: Design systems with redundancy, fault tolerance, and graceful degradation.
- Observability & Monitoring: Implement full-stack observability including monitoring, logging, tracing, and alerting.
- Automation First: Build workflows to automate deployments, incident response, and routine tasks.
- Incident Management: Enable blameless postmortems and continuous improvement.
- Release Planning: Collaborate with DevOps and engineering teams to manage lifecycle work items and release cycles.
- Global Collaboration: Work in a shared responsibility model, maintaining a 50-60% overlap with onshore teams for effective communication.
Required Skills & Experience
- Cloud Platforms: Azure (preferred), AWS (acceptable with upskilling plan)
- Infrastructure as Code: Terraform, Helm, GitHub Actions
- Containerization & Orchestration: Docker, Kubernetes, Arg...
Ready to Apply?
Take the next step in your AI career. Submit your application to Hitachi Solutions today.