Install flash-attn (Flash Attention)

This guide is optimized for the most common goal: installing flash-attn quickly, without compiling it yourself. Use the wheel finder to get an install command that points directly at a matching wheel URL.

Recommended

Install from a prebuilt wheel (no compile).

Install with prebuilt wheels →

If you prefer uv

Same wheel install, faster resolver.

Install with uv →

Fallback

Build from source when no wheel matches.

Install from source →
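
If you do end up building from source, the route the flash-attn project itself documents is roughly the following; treat it as a sketch, since exact flags vary by release and compilation can take a long time (MAX_JOBS limits parallel compile jobs on machines with little RAM):

    pip install ninja
    MAX_JOBS=4 pip install flash-attn --no-build-isolation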

Quick start (prebuilt wheel)

  1. Open the wheel finder and choose your platform, flash-attn version, Python, PyTorch, and CUDA.
  2. Copy the install command. You’ll get commands like:
    pip install https://example.com/flash_attn-...whl
    uv pip install https://example.com/flash_attn-...whl
  3. Verify the install:
    python -c "import flash_attn; print('flash_attn ok')"
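
If the import succeeds and you want a functional check, the sketch below runs one attention forward pass. It assumes a CUDA GPU and the flash-attn 2.x flash_attn_func entry point, which takes fp16/bf16 tensors shaped (batch, seqlen, nheads, headdim):

    # Hedged smoke test: requires a CUDA GPU; fp16/bf16 inputs only.
    import torch
    from flash_attn import flash_attn_func

    q = torch.randn(1, 128, 8, 64, device="cuda", dtype=torch.float16)
    k = torch.randn(1, 128, 8, 64, device="cuda", dtype=torch.float16)
    v = torch.randn(1, 128, 8, 64, device="cuda", dtype=torch.float16)

    out = flash_attn_func(q, k, v, causal=True)  # output has the same shape as q
    print("flash_attn forward ok:", tuple(out.shape))  # expected: (1, 128, 8, 64)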

If something fails

Most install failures come from a version mismatch between Python, PyTorch, CUDA, and the wheel you picked.
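
One way to see what your environment actually provides, so you can compare it against the fields in the wheel filename, is a short check like this sketch (the torch attributes used are standard PyTorch and do not require flash-attn to be installed):

    # Hedged diagnostic sketch: print the versions a flash-attn wheel must match.
    import sys
    import torch

    print("python   :", sys.version.split()[0])
    print("torch    :", torch.__version__)
    print("cuda     :", torch.version.cuda)   # CUDA version PyTorch was built against
    print("cxx11abi :", torch._C._GLIBCXX_USE_CXX11_ABI)  # ABI flag encoded in some wheel names
    print("gpu      :", torch.cuda.is_available())

Use these pages to recover quickly: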