Job Description

Your Role

You will work as a core member of the Operations team, bringing your professional expertise to ensure the reliability and stability of our business-critical cloud solutions. As an SRE, you will design and operate scalable infrastructure, implement monitoring and observability solutions, and drive operational excellence through systematic improvement and automation.

Key Responsibilities

  • Define and Maintain Service Reliability: Take ownership of defining, implementing, tracking, and improving SLIs, SLOs, and error budgets. Escalate and drive resolution when thresholds are exceeded.
  • Reduce Toil and Automate Operations: Identify and eliminate repetitive manual tasks through automation. This includes automating standard operational tasks (key data management, infrastructure component updates, report creation) and system management tasks (restarts and certificate renewals), as well as optimizing CI/CD pipelines and deployment processes.
  • Maintain Environments: Ens...

To apply, submit your application to ZEISS Group.