Simple Preference Optimization (SimPO) was introduced in the paper "SimPO: Simple Preference Optimization with a Reference-Free Reward" (Meng et al., 2024).
Subject
Simple Preference Optimization (SimPO)
Predicate
introduced_in_paper
Object
SimPO: Simple Preference Optimization with a Reference-Free Reward (Meng et al., 2024)
Primary source · preprint · 2024-05-23
SimPO: Simple Preference Optimization with a Reference-Free Reward — arXiv (Meng, Xia, Chen — University of Virginia + Princeton)