Job Description

We are seeking a talented and motivated Site Reliability Engineer (SRE) to join our organization.

The SRE will play a crucial role in ensuring the reliability, scalability, and performance of our infrastructure and applications, and in planning their capacity. The ideal candidate will have a strong background in software engineering, system administration, containerization, and cloud technologies.

Technologies

  • CI/CD, Jenkins, Docker, Kubernetes, Terraform, Ansible, Python, Prometheus, Grafana, ELK stack, Splunk, Dynatrace, Datadog, or similar
  • SLI, SLO, SLA, and error budget concepts

Responsibilities

  • Design, implement, and manage scalable, reliable, and secure cloud infrastructure using tools such as Terraform, Kubernetes, and Docker
  • Develop and maintain monitoring and alerting systems to ensure the health and performance of applications and inf...

Ready to Apply?

Take the next step in your AI career. Submit your application to Epam today.