Install flash-attn (Flash Attention)

This guide is optimized for the most common goal: installing flash-attn quickly without compiling. Use the wheel finder to get a command that installs directly from a matching wheel URL.

Recommended

Install from a prebuilt wheel (no compile).

Prebuilt wheels →

Windows

Windows-specific install guide.

Windows install →

Using uv

Faster install with uv package manager.

Install with uv →

From Source

Build when no wheel matches.

Compile from source →

Quick start (prebuilt wheel)

  1. Open the wheel finder: choose your platform, flash-attn version, Python, PyTorch, and CUDA.
  2. Copy the command: you’ll get a command like one of:
    pip install https://example.com/flash_attn-...whl
    uv pip install https://example.com/flash_attn-...whl
  3. Verify:
    python -c "import flash_attn; print('flash_attn ok')"
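The one-liner in step 3 can be expanded into a slightly more informative check; a minimal sketch (it only tests the import, not an actual attention call, and the `__version__` lookup is guarded in case the attribute is absent):

```python
# Check that flash_attn imports and report its version (or the failure).
try:
    import flash_attn
    msg = f"flash_attn ok, version: {getattr(flash_attn, '__version__', 'unknown')}"
except ImportError as exc:
    # An import failure here usually means the installed wheel did not
    # match your Python / PyTorch / CUDA combination.
    msg = f"flash_attn import failed: {exc}"
print(msg)
```

If the import fails, note the error message before moving on: it usually names the missing or mismatched component.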

If something fails

Most install failures are version mismatches. Use these pages to recover quickly:
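Before picking a recovery page, it helps to print the versions a prebuilt wheel must match. A minimal sketch (the `torch` import is guarded, since a missing or broken PyTorch install is itself a common cause):

```python
import sys
import platform

# The axes a prebuilt flash-attn wheel must match:
# Python version, OS/platform, PyTorch version, and the CUDA version
# that PyTorch was built against.
report = {
    "python": sys.version.split()[0],
    "platform": platform.system().lower(),
}
try:
    import torch
    report["torch"] = torch.__version__
    report["cuda"] = torch.version.cuda  # CUDA version torch was built with
except ImportError:
    report["torch"] = "not installed"

for key, value in report.items():
    print(f"{key}: {value}")
```

Compare this output against the tags in the wheel filename (e.g. `cp311`, `torch2.3`, `cu121`); any mismatch points at the page you need.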