SourceScore VERITAS · verified claim · 100% confidence

The MMLU benchmark was introduced in the paper Measuring Massive Multitask Language Understanding (Hendrycks et al., 2020).

Subject
MMLU benchmark
Predicate
introduced_in_paper
Object
Measuring Massive Multitask Language Understanding (Hendrycks et al., 2020)
Primary source · preprint · 2020-09-07
Measuring Massive Multitask Language Understanding, arXiv (Hendrycks et al.)
Last verified 2026-05-16 · 2 sources · 428d754e7c651be6