Job Description
We are seeking a Site Reliability Engineer (SRE) with strong experience in production support, incident management, and change management. The ideal candidate will combine software engineering skills with operational excellence to ensure high availability, reliability, performance, and scalability of mission‑critical systems.
Key Responsibilities
- Production Support & Operations
  - Manage and support large‑scale production environments with a focus on uptime, service health, and proactive issue prevention.
  - Lead Incident Management processes, including troubleshooting, root cause analysis (RCA), and post‑incident reviews.
  - Oversee Change Management activities, ensuring minimal service disruption and compliance with operational standards.
- DevOps & Automation
  - Build, maintain, and optimize CI/CD pipelines using modern DevOps tools.
  - Implement and manage Infrastructure as Code (IaC) using tool...
Submit your application to Coforge.