@rocking5566
Collaborator

Proposed changes

Implement dynamic quantization for FMHA to replace the older static quantization.
Users now need to pass the descale factors of Q, K, and V for dynamic quantization.
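
For readers unfamiliar with the term, here is a minimal sketch of what a per-tensor descale factor means and how it is applied after a scaled dot product. It is illustrative only and is not the kernel code in this PR. The helper names `per_tensor_descale` and `quantize` are hypothetical, and the default `fp8_max=448.0` assumes the OCP FP8 E4M3 range, which the PR description does not state. Actual FP8 rounding is omitted.

```python
def per_tensor_descale(values, fp8_max=448.0):
    """Return one descale (dequantization) factor for a whole tensor.

    Hypothetical helper: fp8_max=448.0 assumes OCP FP8 E4M3.
    """
    amax = max(abs(v) for v in values)
    # Guard against an all-zero tensor so the later division is safe.
    return amax / fp8_max if amax > 0.0 else 1.0


def quantize(values, descale, fp8_max=448.0):
    """Scale values into the FP8 range and clamp; FP8 rounding is omitted."""
    return [max(-fp8_max, min(fp8_max, v / descale)) for v in values]


# Toy one-row Q and K; a real tensor would be multi-dimensional.
q = [0.5, -2.0, 1.0]
k = [4.0, 0.25, -1.0]

descale_q = per_tensor_descale(q)
descale_k = per_tensor_descale(k)

q_scaled = quantize(q, descale_q)
k_scaled = quantize(k, descale_k)

# The dot product is computed in the scaled domain, then both descale
# factors are applied to recover the score in the original range.
score = sum(a * b for a, b in zip(q_scaled, k_scaled)) * descale_q * descale_k
reference = sum(a * b for a, b in zip(q, k))
```

Because the factors are computed from the data at run time rather than fixed ahead of time, the caller has to supply them alongside the quantized tensors, which is why the interface now takes descale values for Q, K, and V.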

Checklist

./bin/tile_example_fmha_fwd -warmup=0 -repeat=1 -init=3 -b=1 -s=128 -h=1 -d=128 -prec=fp8bf16 -qscale=1

@rocking5566 rocking5566 force-pushed the rocking/fmha-fp8-pertensor branch from 02dac2c to ac204da Compare November 13, 2025 20:26
@rocking5566 rocking5566 removed the WIP label Nov 15, 2025