Job Description

Altimetrik is a leading digital business enablement company driving innovation through modern engineering and AI-powered solutions. We are seeking two Senior Site Reliability Engineers to own observability, monitoring, and reliability for BNY's platform services and AI agent deployments. You will build SRE dashboards, define SLIs/SLOs, and architect monitoring strategies that cover both traditional microservices and emerging AI/LLM agent pipelines. AI-assisted development is a core expectation of the role.

- Remote role
- Advanced English (B2+/C1)

What You'll Work On

- Design and maintain observability dashboards (Grafana, Splunk Dashboard Studio) for platform and AI agent health
- Define and track SLIs, SLOs, and SLAs for critical transactional and AI-driven flows
- Instrument AI/LLM agents with metrics, structured logs, and distributed traces
- Build and optimize alerting strategies (OpsGenie, PagerDuty) with severity-mapped routing
- Drive MTTD/MTTR reduction through proactive monitoring and blam...
