Job Description

This is a remote position.

Core Expertise
  • SRE Foundations & Practices: Deep understanding of SRE principles (SLIs, SLOs, error budgets, toil reduction, reliability vs. velocity trade-offs).
    Proven experience driving SRE adoption and culture change across teams and applications.
    Strong knowledge of incident management, on-call practices, and blameless postmortems.
  • Cloud & Infrastructure: 5+ years of experience with Google Cloud Platform (GCP) services.
    Solid expertise with Kubernetes, including scaling, workload optimization, network policies, service mesh, and troubleshooting.
    Experience with infrastructure as code.
  • Reliability & Observability: Strong knowledge of monitoring, logging, and tracing.
    Proven ability to design and implement alerting strategies aligned with SLOs/SLIs.
    Hands-on experience optimizing application per...
Ready to Apply?

Take the next step in your AI career. Submit your application to iScale Solutions Inc today.
