InstructGPT introduced in: Ouyang et al. 2022 — RLHF-tuned GPT-3, direct ancestor of ChatGPT.
Object
Ouyang et al. 2022 — RLHF-tuned GPT-3, direct ancestor of ChatGPT
Primary source · preprint · 2022-03-04
Training language models to follow instructions with human feedback — arXiv (Ouyang, Wu, Jiang, Almeida, Wainwright, Mishkin, Zhang, Agarwal, et al. / OpenAI)