The Switch Transformer was introduced in the paper "Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity" (Fedus et al., 2021).
Subject
Switch Transformer
Predicate
introduced_in_paper
Object
Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity (Fedus et al., 2021)
Primary source · preprint · 2021-01-11
Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity — arXiv (Fedus, Zoph, Shazeer)