Tag
stanford
2 verified claims carry this tag. Each is backed by 2+ primary sources and carries an HMAC-SHA256 signature.
Direct Preference Optimization (DPO) was introduced in the paper: Direct Preference Optimization: Your Language Model is Secretly a Reward Model (Rafailov et al., 2023).
a3e691683a4577af · 2 sources · 100% confidence
FlashAttention was introduced in the paper: FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness (Dao et al., 2022).
e120182d1e01ea2b · 2 sources · 100% confidence
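A signature of the kind attached to each claim above could be produced roughly as sketched below: an HMAC-SHA256 over the claim text, truncated to 16 hex characters to match the IDs shown. The secret key, the text encoding, and the truncation length are assumptions for illustration, not details taken from this listing.

```python
import hmac
import hashlib

def sign_claim(claim_text: str, secret_key: bytes) -> str:
    """Return a short HMAC-SHA256 signature for a claim.

    Assumptions (not confirmed by the listing above): the claim text is
    UTF-8 encoded, and the full 64-char hex digest is truncated to 16
    characters for display.
    """
    digest = hmac.new(secret_key, claim_text.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]

# Hypothetical key and claim text, for illustration only.
sig = sign_claim(
    "FlashAttention introduced in paper: FlashAttention: Fast and "
    "Memory-Efficient Exact Attention with IO-Awareness (Dao et al., 2022).",
    b"example-secret-key",
)
```

Because HMAC is keyed, anyone holding the secret can recompute the signature to check that a claim's text has not been altered, while readers without the key cannot forge new signatures.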