Job Description

Job Title: Platform Site Reliability Engineer (SRE)
For all LATAM

Position Summary

We are seeking a Platform Site Reliability Engineer (SRE) to support the reliability, observability, and day-2 operations of modern AI platform environments running performance-sensitive workloads. This role is suited for someone with hands-on experience in production support, monitoring, alerting, incident response, Linux troubleshooting, operational automation, system software maintenance, and GPU-enabled platform operations across infrastructure and platform layers.

The ideal candidate has experience with Prometheus, Grafana, and logging/metrics platforms, and can work across compute, platform, DevOps, storage, and network teams to...

Ready to Apply?

Take the next step in your AI career. Submit your application to Kutir Technologies today.