Job Description
Do you have hands-on experience designing reliable evaluations for LLM/NLP features? Do you enjoy turning messy product questions into clear study designs, metrics, and production-ready code?
About our Team
Elsevier’s AI Evaluation team designs, builds, and operates NLP/LLM evaluation solutions used across multiple product lines. We partner with Product, Technology, Domain SMEs, and Governance to ensure our AI features are safe, effective, and continuously improving.
About the Role
As a Senior Data Scientist III, you will design and implement end-to-end evaluation studies and pipelines for AI products. You’ll translate product requirements into statistically sound test designs and metrics, build reproducible Python/SQL pipelines, run analyses and QC, and deliver concise readouts that drive roadmap decisions and risk mitigation. You’ll collaborate closely with SMEs, contribute to our shared evaluation libraries, and produce audit-ready documentation aligne...
Ready to Apply?
Take the next step in your AI career. Submit your application to RELX INC today.