Install flash-attn (Flash Attention)
This guide is optimized for the most common goal: installing flash-attn quickly, without compiling from source. Use the wheel finder to get a command that installs directly from a matching wheel URL.
Quick start (prebuilt wheel)
- 1) Open the wheel finder and choose your platform, flash-attn version, Python, PyTorch, and CUDA.
- 2) Copy the command: you’ll get commands like:
pip install https://example.com/flash_attn-...whl
uv pip install https://example.com/flash_attn-...whl
- 3) Verify the import (a fuller smoke test follows this list):
python -c "import flash_attn; print('flash_attn ok')"
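Beyond a bare import, you can run a tiny forward pass to confirm the CUDA kernels actually load. This is a minimal sketch, assuming a CUDA GPU is present; it uses the library's flash_attn_func entry point with its (batch, seqlen, nheads, headdim) tensor layout, but the sizes and dtype here are arbitrary:

```python
# Minimal smoke test for a fresh flash-attn install.
# Assumes a CUDA GPU; flash-attn kernels require fp16 or bf16 inputs.
import torch
from flash_attn import flash_attn_func

# (batch, seqlen, nheads, headdim) -- tiny sizes, just to exercise the kernel
q = torch.randn(1, 8, 4, 64, device="cuda", dtype=torch.float16)
k = torch.randn_like(q)
v = torch.randn_like(q)

out = flash_attn_func(q, k, v, causal=True)
print("flash_attn forward ok, output shape:", tuple(out.shape))
```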
If something fails
Most install failures come from a version mismatch between flash-attn, Python, PyTorch, and CUDA. Use these pages to recover quickly:
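Since the wheel finder needs your exact Python, PyTorch, and CUDA versions, a quick first step is printing them from the environment you are installing into. A small sketch, assuming PyTorch is already installed; the C++11 ABI flag is included because prebuilt flash-attn wheel filenames encode it (cxx11abiTRUE/FALSE):

```python
# Print the environment details the wheel finder asks for.
import platform
import torch

print("Python:", platform.python_version())
print("PyTorch:", torch.__version__)
print("CUDA (torch build):", torch.version.cuda)  # None => CPU-only torch build
print("CUDA available:", torch.cuda.is_available())
print("C++11 ABI:", torch._C._GLIBCXX_USE_CXX11_ABI)
```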