Job Description
Responsibilities
- Evaluate AI model responses for personalization quality, including grounding, integration, and helpfulness.
- Design and execute multi-turn prompts based on personal context to test AI capabilities.
- Analyze responses for hallucinations, incorrect personalization, and poor inferences.
- Perform side-by-side comparisons of model outputs to determine their relative quality and effectiveness.
- Write clear and structured rationales for response evaluations and rankings.
- Extract and verify debug information to confirm that data sources are used properly.
- Maintain strict data hygiene and ensure accurate documentation of evaluations.
- Collaborate with cross-functional teams to improve AI model performance.
Requirements
- Strong proficiency in Polish with excellent reading and writing skills.
- Experience in data annotation, AI evaluation, content moderation, or a related ...
Ready to Apply?
Take the next step in your AI career. Submit your application to Crossing Hurdles today.