flash-attn Versions & Release History
Complete version history for Flash Attention. Find the right version for your Python, PyTorch, and CUDA setup. Download prebuilt wheels for any version using the wheel finder.
Latest
flash-attn 2.8.3
Released January 2026 with Python 3.13 support, PyTorch 2.9 compatibility, and CUDA 12.8 wheels.
Version History
v2.8.3 (Latest)
Released: January 2026
Python: 3.10, 3.11, 3.12, 3.13
PyTorch: 2.4, 2.5, 2.6, 2.7, 2.8, 2.9
CUDA: 11.8, 12.1, 12.2, 12.4, 12.6, 12.8
- Python 3.13 support added
- PyTorch 2.9 compatibility
- CUDA 12.8 wheels available
- Performance improvements for H100/H200
v2.7.4
Released: November 2025
Python: 3.10, 3.11, 3.12
PyTorch: 2.3, 2.4, 2.5, 2.6
CUDA: 11.8, 12.1, 12.2, 12.4
- Improved memory efficiency
- Better backward pass performance
- Bug fixes for edge cases
v2.6.3
Released: August 2025
Python: 3.9, 3.10, 3.11, 3.12
PyTorch: 2.1, 2.2, 2.3, 2.4
CUDA: 11.8, 12.1, 12.2
- Stable release with wide compatibility
- Python 3.9 still supported
- Good choice for older PyTorch versions
v2.5.9
Released: May 2025
Python: 3.8, 3.9, 3.10, 3.11
PyTorch: 2.0, 2.1, 2.2, 2.3
CUDA: 11.7, 11.8, 12.1
- Last version with Python 3.8 support
- Compatible with PyTorch 2.0
- CUDA 11.7 wheels available
Which Version Should I Use?
For new projects
Use flash-attn 2.8.3 with Python 3.11 or 3.12 and PyTorch 2.5+. This gives you the best performance and widest wheel availability.
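A minimal smoke test for that setup, assuming a CUDA GPU and flash-attn 2.8.3 installed with `pip install flash-attn==2.8.3 --no-build-isolation` (the `--no-build-isolation` flag lets the build see your existing PyTorch install if pip falls back to compiling from source):

```python
# Smoke test for a fresh flash-attn install (assumes a CUDA GPU).
# flash_attn_func expects (batch, seqlen, nheads, headdim) tensors
# in fp16 or bf16, already on the GPU.
import torch
from flash_attn import flash_attn_func

q = torch.randn(1, 128, 8, 64, device="cuda", dtype=torch.float16)
k = torch.randn(1, 128, 8, 64, device="cuda", dtype=torch.float16)
v = torch.randn(1, 128, 8, 64, device="cuda", dtype=torch.float16)

out = flash_attn_func(q, k, v, causal=True)
print(out.shape)  # torch.Size([1, 128, 8, 64])
```

If the import and the call both succeed, the wheel matches your Python, PyTorch, and CUDA combination.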
For PyTorch 2.0-2.3
Use flash-attn 2.5.9 or 2.6.3. These versions maintain compatibility with older PyTorch releases.
For Python 3.8 or 3.9
Use flash-attn 2.5.9 (last version with Python 3.8) or 2.6.3 (last version with Python 3.9).
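If you want to script this choice, here is a small sketch that maps the running interpreter to the pins recommended above. The version floors are taken from the tables on this page; confirm against the wheel finder before installing.

```python
# Pick a flash-attn pin based on the running interpreter.
# Floors mirror the guidance above: 2.5.9 for Python 3.8, 2.6.3 for 3.9,
# and the latest release otherwise.
import sys

if sys.version_info < (3, 9):
    pin = "flash-attn==2.5.9"   # last release with Python 3.8 support
elif sys.version_info < (3, 10):
    pin = "flash-attn==2.6.3"   # last release with Python 3.9 support
else:
    pin = "flash-attn==2.8.3"   # current release
print(f"pip install {pin} --no-build-isolation")
```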
Find Your Wheel
Use the wheel finder to download prebuilt wheels for any flash-attn version.
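Once the finder has given you a wheel file, installing it into the current interpreter looks roughly like this. The filename below is hypothetical; use the exact name you downloaded, which encodes the CUDA and PyTorch versions the wheel was built against.

```python
# Install a downloaded flash-attn wheel into the active environment.
import subprocess
import sys

wheel = "flash_attn-2.8.3-cp312-cp312-linux_x86_64.whl"  # hypothetical filename
subprocess.check_call([sys.executable, "-m", "pip", "install", wheel])
```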
Open Wheel Finder →
Check Compatibility
Verify your Python, PyTorch, and CUDA versions work together:
Compatibility Checklist →
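A quick way to gather those numbers before going through the checklist; note that torch.version.cuda reports the CUDA version PyTorch was built against, which is the value that matters when matching a wheel:

```python
# Print the versions that determine which flash-attn wheel you need.
import sys
import torch

print("Python :", sys.version.split()[0])
print("PyTorch:", torch.__version__)
print("CUDA   :", torch.version.cuda)  # CUDA toolkit PyTorch was built with
print("GPU    :", torch.cuda.get_device_name(0) if torch.cuda.is_available() else "none detected")
```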