Job Description

Overview

LILT is building a global network of domain experts to support high-quality AI evaluation across training, benchmarking, red-teaming, and ongoing model monitoring. We are seeking software engineering and DevOps professionals to contribute expert judgment to human-in-the-loop AI evaluation workflows used by leading enterprises and hyperscalers.

This role is designed for professionals who understand how software systems, infrastructure, and development practices work in real production environments, and who can apply that expertise to evaluate, assess, and improve multilingual AI systems.

Your contribution of expertise will directly influence multilingual AI model quality, safety, and deployment readiness.

This role includes two distinct expert tracks based on experience level and scope of responsibility.

Track A: Software Engineering & DevOps AI Rater

Raters execute structured evaluation tasks using clearly defined rubrics and instructions...

Ready to Apply?

Take the next step in your AI career. Submit your application to LILT today.

Submit Application