flash-attn Versions & Release History

Complete version history for Flash Attention. Find the right version for your Python, PyTorch, and CUDA setup. Download prebuilt wheels for any version using the wheel finder.

Latest

flash-attn 2.8.3

Released January 2026 with Python 3.13 support, PyTorch 2.9 compatibility, and CUDA 12.8 wheels.

Version History

v2.8.3

Latest · January 2026

Python: 3.10, 3.11, 3.12, 3.13
PyTorch: 2.4, 2.5, 2.6, 2.7, 2.8, 2.9
CUDA: 11.8, 12.1, 12.2, 12.4, 12.6, 12.8
  • Python 3.13 support added
  • PyTorch 2.9 compatibility
  • CUDA 12.8 wheels available
  • Performance improvements for H100/H200

v2.7.4

November 2025

Python: 3.10, 3.11, 3.12
PyTorch: 2.3, 2.4, 2.5, 2.6
CUDA: 11.8, 12.1, 12.2, 12.4
  • Improved memory efficiency
  • Better backward pass performance
  • Bug fixes for edge cases

v2.6.3

August 2025

Python: 3.9, 3.10, 3.11, 3.12
PyTorch: 2.1, 2.2, 2.3, 2.4
CUDA: 11.8, 12.1, 12.2
  • Stable release with wide compatibility
  • Python 3.9 still supported
  • Good choice for older PyTorch versions

v2.5.9

May 2025

Python: 3.8, 3.9, 3.10, 3.11
PyTorch: 2.0, 2.1, 2.2, 2.3
CUDA: 11.7, 11.8, 12.1
  • Last version with Python 3.8 support
  • Compatible with PyTorch 2.0
  • CUDA 11.7 wheels available

Which Version Should I Use?

For new projects

Use flash-attn 2.8.3 with Python 3.11 or 3.12 and PyTorch 2.5+. This gives you the best performance and widest wheel availability.

For PyTorch 2.0-2.3

Use flash-attn 2.6.3 (PyTorch 2.1–2.4) or 2.5.9 (PyTorch 2.0–2.3). These versions maintain compatibility with older PyTorch releases.

For Python 3.8 or 3.9

Use flash-attn 2.5.9 (last version with Python 3.8) or 2.6.3 (last version with Python 3.9).

Find Your Wheel

Use the wheel finder to download prebuilt wheels for any flash-attn version:

Open Wheel Finder →

Check Compatibility

Verify your Python, PyTorch, and CUDA versions work together:
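A quick way to gather the versions you need to check, sketched in Python (the PyTorch import is guarded so the snippet also runs where torch is not installed):

```python
import sys

def environment_report() -> dict:
    """Collect the interpreter, PyTorch, and CUDA versions relevant to flash-attn."""
    info = {"python": f"{sys.version_info.major}.{sys.version_info.minor}"}
    try:
        import torch  # only available if PyTorch is installed
        info["torch"] = torch.__version__
        info["cuda"] = torch.version.cuda  # None on CPU-only builds
    except ImportError:
        info["torch"] = info["cuda"] = None
    return info

print(environment_report())
```

Keep in mind that `torch.version.cuda` reports the CUDA toolkit version PyTorch was built against; the CUDA version your driver supports is reported separately by `nvidia-smi`.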

Compatibility Checklist →