Tag
rlhf
3 verified claims carry this tag. Each is backed by 2+ primary sources and carries an HMAC-SHA256 signature.
Reinforcement Learning from Human Feedback (RLHF) introduced in paper: Deep Reinforcement Learning from Human Preferences (Christiano et al., 2017).
67866330cd60e54d · 3 sources · 100% confidence
InstructGPT methodology introduced in paper: Training language models to follow instructions with human feedback (Ouyang et al., 2022).
5da8f8dffc038b8e · 2 sources · 100% confidence
Proximal Policy Optimization (PPO) introduced in paper: Proximal Policy Optimization Algorithms (Schulman et al., 2017).
00f224e1ccc158ef · 2 sources · 100% confidence
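The listing states that every claim carries an HMAC-SHA256 signature, shown above as a 16-hex-character identifier. A minimal sketch of how such a truncated signature could be computed and verified, assuming a hypothetical signing key and assuming the claim text itself is the signed message (neither the key nor the exact message format is documented here):

```python
import hmac
import hashlib

def sign_claim(claim_text: str, key: bytes) -> str:
    """Compute an HMAC-SHA256 tag over a claim, truncated to 16 hex chars."""
    digest = hmac.new(key, claim_text.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]  # truncation to match the 16-char IDs shown in the listing

# Hypothetical key and a claim string from this tag page
key = b"example-signing-key"
claim = ("Proximal Policy Optimization (PPO) introduced in paper: "
         "Proximal Policy Optimization Algorithms (Schulman et al., 2017).")

tag = sign_claim(claim, key)

# Verification: recompute with the same key and compare in constant time
assert hmac.compare_digest(sign_claim(claim, key), tag)
```

Truncating the digest keeps the identifier compact while still making accidental collisions unlikely; verification simply recomputes the tag with the shared key and compares using a constant-time check.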