Job Description

HME is looking for a Senior Software Engineer to own and operate our OpenSearch-based Observability Platform. This role is responsible for running a highly available, secure, and scalable OpenSearch platform on Azure Kubernetes Service (AKS), acting as the DevOps and SRE owner and enabling application teams to onboard telemetry using Open Telemetry.

What you will do in the position:

  • Own DevOps and SRE responsibilities for the OpenSearch observability platform.
  • Design, deploy, and operate OpenSearch clusters on Kubernetes with focus on high availability, scalability, resiliency, and disaster recovery.
  • Manage cluster sizing, shard strategy, index lifecycle management, retention, and performance tuning.
  • Lead upgrades, capacity planning, and zero / minimal-downtime operations.
  • Collaborate with application teams to integrate Open Telemetry-based logs, metrics, and traces into OpenSearch.
  • B...

Ready to Apply?

Take the next step in your AI career. Submit your application to HME today.

Submit Application