Job Description
<p><b>Observability & Monitoring</b></p>
<p>Architect and operate enterprise-scale observability platforms on AWS/Azure, covering microservices, Lambda/serverless functions, and Kubernetes-based workloads.</p>
<p>Build full-stack observability solutions using Prometheus, Grafana, CloudWatch, Datadog, Splunk, and Fluentd - covering metrics, logs, and distributed traces.</p>
<p>Define and implement SLIs, SLOs, and error budgets; build actionable dashboards and alerting policies aligned to business and technical objectives.</p>
<p>Standardize application logging across engineering codebases by designing and championing reusable Python logging modules and log aggregation pipelines.</p>
<p>Instrument applications and services with distributed tracing to identify performance bottlenecks and reduce MTTR.</p>
<p><b>Automation & SRE</b></p>
<p>Design and build reusable automation frameworks standardized as team-wide baselines for all operational automation ...</p>
Ready to Apply?
Take the next step in your AI career. Submit your application to Arminus today.