Job Description

Responsibilities

  • Agent scaffolding: tool use, context management, sandboxing, prompt-injection defence
  • Evals for fuzzy, high-stakes outputs: assessments, policy interpretation, control mapping
  • Reliability infrastructure: retries, fallbacks, circuit breakers, prompt versioning
  • Define the internal standard for what "good enough to ship" means for AI features across the organization

Qualifications

  • Backend engineering experience in TypeScript or a comparable language, including 1–2+ years shipping production LLM features
  • Experience with agent frameworks, tool calling, and multi-step orchestration
  • Production evals chops: dataset curation, LLM-as-judge failure modes, regression testing under model swaps
  • Strong systems thinking: async, queues, idempotency
  • Comfort being the named owner of AI quality, including saying no when needed

Nice to have

  • Anthropic, O...

Ready to Apply?

Take the next step in your AI career. Submit your application to HelmGuard Technologies, Inc. today.