Job Description

This project assesses and benchmarks advanced agentic audio models against leading systems. Its mission is to evaluate and optimize model performance for real-world customer support use cases.

Responsibilities

  • Create and execute role-play–based evaluation scenarios that simulate realistic customer service interactions across multiple domains, including:
      • Flight bookings and travel support
      • Financial services
      • Telecommunications and technical support
  • Contribute to the development of diverse and representative datasets used to assess conversational audio agents.
  • Evaluate model performance across a standardized set of qualitative and quantitative metrics.
  • Ensure evaluations reflect real customer expectations for clarity, efficiency, and natural conversational flow.

Evaluation Metrics

Model performance is assessed using a combination of conver...

Ready to Apply?

Take the next step in your AI career. Submit your application to Meridial Marketplace, by Invisible, today.
