The HumanEval benchmark was introduced in the paper "Evaluating Large Language Models Trained on Code" (Chen et al., 2021).
Subject: HumanEval benchmark
Predicate: introduced_in_paper
Object: Evaluating Large Language Models Trained on Code (Chen et al., 2021)
Primary source · preprint · 2021-07-07
Evaluating Large Language Models Trained on Code — arXiv (Chen et al., OpenAI)
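Alongside HumanEval, the cited paper introduced the pass@k metric and an unbiased estimator for it: given n generated samples of which c pass the unit tests, pass@k = 1 − C(n−c, k)/C(n, k). A minimal sketch of that estimator (function name is illustrative, not from the paper's codebase):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator from Chen et al. (2021).

    n: total samples generated per problem
    c: number of samples that pass the unit tests
    k: budget of samples considered
    """
    if n - c < k:
        # Every size-k subset must contain at least one passing sample.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)
```

For example, with 2 samples of which 1 is correct, pass@1 is 0.5: a randomly drawn sample passes half the time.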