Job Description

 Responsibilities

  • Increase observability of applications, services, and infrastructure using tools such as OpenTelemetry, the Grafana ecosystem (Grafana, Loki, Mimir, Tempo), and Fluentd.
  • Automate infrastructure and application management using Terraform, Kubernetes, and Puppet.
  • Build and maintain CI/CD pipelines using GitLab, Argo CD, and Kustomize.
  • Collaborate with product teams to define Service Level Objectives (SLOs) and monitor user experience.
  • Participate in incident, problem, and change management programs to minimize service disruption.
  • Investigate complex system issues, perform root cause analysis, and implement long-term solutions.
  • Apply infrastructure-as-code practices to ensure reproducible and scalable deployments.
  • Continuously learn new technologies and share knowledge with the team.
  • Proactively improve systems and processes to prevent future failures and protect the customer experi...

Ready to Apply?

Take the next step in your AI career. Submit your application to Bottom Line today.