SourceScore VERITAS · verified claim · 92% confidence

The MMLU-Pro benchmark was introduced in the paper "MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark" (Wang et al., 2024).

Subject: MMLU-Pro benchmark
Predicate: introduced_in_paper
Object: MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark (Wang et al., 2024)
Primary source · preprint · 2024-06-03
MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark · arXiv · Yubo Wang et al. (TIGER-Lab)
Last verified 2026-05-31 · 3 sources · 2df92e0b0e4c891b