@MasterJH5574
Contributor

This PR fixes a few kernel dispatch issues caused by the recent introduction of `mha_sliding` as a new attention kind.

Tested on Qwen3 1.7B with MLC-LLM.
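
For readers unfamiliar with this class of bug, here is a minimal Python sketch of how dispatch on attention kinds can break when a new kind is added. It is an illustration only, not the code changed in this PR: the helper names, the `MHA_LIKE_KINDS` set, and the `"mha"` and `"mla"` kind strings are assumptions made for the example; only `mha_sliding` is named in the description above.

```python
from typing import List, Union

# Hypothetical set of kinds that share MHA-style kernels.
# Only "mha_sliding" is named in this PR; the rest is assumed.
MHA_LIKE_KINDS = {"mha", "mha_sliding"}


def naive_needs_mha_kernels(attn_kind: Union[str, List[str]]) -> bool:
    """Exact-match check that silently misses newly added kinds."""
    return attn_kind == "mha"


def needs_mha_kernels(attn_kind: Union[str, List[str]]) -> bool:
    """Return True if any layer uses an MHA-style attention kind.

    Accepts either a single kind or a per-layer list of kinds.
    """
    kinds = [attn_kind] if isinstance(attn_kind, str) else list(attn_kind)
    return any(kind in MHA_LIKE_KINDS for kind in kinds)


if __name__ == "__main__":
    for kind in ("mha", "mha_sliding", ["mha", "mha_sliding"], "mla"):
        print(kind, naive_needs_mha_kernels(kind), needs_mha_kernels(kind))
```

In this sketch the exact-match check returns `False` for `"mha_sliding"` and for any per-layer list, so the MHA kernels would never be selected for those configurations, while the membership-based check handles both.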

@yongwww yongwww merged commit d9f0838 into apache:main Jul 7, 2025
13 checks passed
ShiboXing pushed a commit to ShiboXing/tvm that referenced this pull request Aug 10, 2025
* [KVCache] Fix kernel dispatch based on attention kinds

* Fix lint

---------

Co-authored-by: Yong Wu <[email protected]>
3 participants