FlashAttention-2 was introduced in the paper "FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning" (Dao, 2023).
Predicate
introduced_in_paper
Object
FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning (Dao, 2023)
Primary source · preprint · 2023-07-17
FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning — arXiv (Tri Dao — Princeton + Stanford)