Job Description
This project assesses and benchmarks advanced agentic audio models against leading systems. Its mission is to evaluate and optimize model performance for real-world customer support use cases.
Responsibilities
- Create and execute role-play-based evaluation scenarios that simulate realistic customer service interactions across multiple domains, including:
  - Flight bookings and travel support
  - Financial services
  - Telecommunications and technical support
- Contribute to the development of diverse and representative datasets used to assess conversational audio agents.
- Evaluate model performance across a standardized set of qualitative and quantitative metrics.
- Ensure evaluations reflect real customer expectations for clarity, efficiency, and natural conversational flow.
Evaluation Metrics
Model performance is assessed using a combination of conver...
Ready to Apply?
Take the next step in your AI career. Submit your application to Meridial Marketplace, by Invisible, today.