Job Description

  • Manage, monitor, and improve application reliability, scalability, and performance.
  • Implement and maintain monitoring, alerting, and observability tools (Dynatrace, Kibana, CloudWatch).
  • Troubleshoot production issues and drive root cause analysis (RCA) for incidents.
  • Automate operational processes using scripting (Python, Shell, or similar).
  • Collaborate with development and DevOps teams to improve CI/CD and infrastructure reliability.
  • Ensure high system uptime through proactive performance tuning and incident management.
  • Work with AWS services (EC2, ECS, EKS, Lambda, S3, CloudWatch, etc.) for deployment and monitoring.
  • Participate in the on-call rotation and provide production support as required.
  • Support Java and microservices-based environments, ensuring efficient scaling and health monitoring.
  • Maintain documentation for SRE processes, runbooks, and automation workflows.

Skills Required
Monitoring Tools...

Ready to Apply?

Take the next step in your AI career. Submit your application to Pathfinders Global P Ltd today.
