Grouped-Query Attention (GQA) was introduced in the paper "GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints" (Ainslie et al., 2023).
Subject
Grouped-Query Attention (GQA)
Predicate
introduced_in_paper
Object
GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints (Ainslie et al., 2023)
Primary source · preprint · 2023-05-22
GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints — arXiv (Ainslie, Lee-Thorp, de Jong, Zemlyanskiy, Lebrón, Sanghai — Google Research)