Tag
Two verified claims carry this tag. Each has at least two primary sources and an HMAC-SHA256 signature.
HumanEval benchmark introduced in paper: Evaluating Large Language Models Trained on Code (Chen et al., 2021).
71ec42731d2c9e0c · 2 sources · 100% confidence
Codex introduced in paper: Evaluating Large Language Models Trained on Code (Chen et al., 2021).
79be9b25cd64f250 · 2 sources · 100% confidence
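For readers unfamiliar with HMAC-SHA256, the following is a minimal sketch of how a claim's text could be signed and later checked. It is not this site's actual scheme: the key, the function names `sign_claim` and `verify_claim`, and the choice to sign the UTF-8 claim text directly are all assumptions made for illustration, and no relationship to the identifiers shown on the cards above is implied.

```python
import hashlib
import hmac

# Hypothetical key for illustration only; a real signing key would be kept secret.
SECRET_KEY = b"example-key-not-real"


def sign_claim(claim_text: str, key: bytes = SECRET_KEY) -> str:
    """Return the hex-encoded HMAC-SHA256 of the claim text (assumed UTF-8)."""
    return hmac.new(key, claim_text.encode("utf-8"), hashlib.sha256).hexdigest()


def verify_claim(claim_text: str, signature: str, key: bytes = SECRET_KEY) -> bool:
    """Recompute the signature and compare it in constant time."""
    expected = sign_claim(claim_text, key)
    return hmac.compare_digest(expected, signature)


claim = (
    "Codex introduced in paper: Evaluating Large Language Models "
    "Trained on Code (Chen et al., 2021)."
)
signature = sign_claim(claim)

# An unmodified claim verifies; any change to the text does not.
print(verify_claim(claim, signature))
print(verify_claim(claim + " (edited)", signature))
```

Under these assumptions, anyone holding the key can confirm that a claim's text has not been altered since it was signed, while a party without the key cannot produce a valid signature for a modified claim.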