Job Description
The SRE will play a pivotal role in modernizing IT operations by implementing observability practices and automating toil. The position requires a deep understanding of Site Reliability Engineering (SRE) principles, modern observability tools, and automation techniques to ensure scalable, reliable, and efficient IT systems. We are looking for a strategic thinker with hands‑on expertise who can lead modernization efforts while fostering a culture of reliability and innovation. You will work closely with the Product Engineering team to implement strategies for modernizing IT operations, enhancing observability, and reducing toil.
Responsibilities
- Architect and deploy observability platforms to monitor system health, performance, and reliability effectively.
- Propose and drive strategies for AI‑driven alerting and proactive anomaly detection to reduce MTTD and MTTR.
- Develop and enforce SRE best practices, including Service Level Obje...
Ready to Apply?
Take the next step in your AI career. Submit your application to Infoplus Technologies UK Ltd today.