In your paper, you mention that the Mamba scan is faster than FlashAttention-2. Does that mean you are comparing the selective scan kernel (`class SelectiveScanFn(torch.autograd.Function):`) directly against the FlashAttention-2 kernel?
The inputs of these two modules are different, so is this comparison fair? Or should the preprocessing (computing q, k, v for FlashAttention; computing A, B, C, D, delta for the Mamba scan) be taken into account as well?
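For context on what the kernel being benchmarked actually computes, here is a minimal NumPy sketch of the selective scan recurrence that `SelectiveScanFn` implements (this is my own reference formulation following the Mamba paper's discretized SSM; the tensor layouts and argument names here are assumptions for illustration, not the kernel's actual signature):

```python
import numpy as np

def selective_scan_ref(x, delta, A, B, C, D):
    """Sequential reference of the selective scan recurrence.

    x:     (L, d)  input sequence
    delta: (L, d)  per-step, per-channel discretization step
    A:     (d, n)  state matrix (diagonal per channel)
    B:     (L, n)  input-dependent input matrix
    C:     (L, n)  input-dependent output matrix
    D:     (d,)    skip connection
    """
    L, d = x.shape
    n = A.shape[1]
    h = np.zeros((d, n))          # hidden state, carried across time steps
    ys = np.empty((L, d))
    for t in range(L):
        dA = np.exp(delta[t][:, None] * A)                 # discretized A: (d, n)
        dBx = (delta[t] * x[t])[:, None] * B[t][None, :]   # input term: (d, n)
        h = dA * h + dBx                                   # state update
        ys[t] = (h * C[t][None, :]).sum(-1) + D * x[t]     # readout + skip
    return ys
```

Note that A, B, C, D, and delta arrive as inputs here, just as q, k, v arrive precomputed at the FlashAttention kernel: in both cases the input projections happen outside the kernel being timed, which is the crux of the fairness question above.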