-
Notifications
You must be signed in to change notification settings - Fork 9.4k
Preliminar support for sageattention3 #9047
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Not sure what are the implications of this, but it seems to make sage3 actually run,
I don't feel like filling out approval forms so I'll wait until it's public before testing this. For the line ending check if you sync master it should be fixed. |
@Panchovix I agree with your last point: it's probably better to separate sageattention and sageattention3 into different flags. as they have different signatures and expect different tensor shapes as well. |
Okay I have separated the sage attention versions with different flags:
Now they work separately (so you can have both installed in the venv and use the one you want). I also still get less performance on SDXL for some reason. But I'm not sure if my implementation is not correct and that may cause that perf regression. Kijai seems to have better performance on video, as he mentioned on https://huggingface.co/jt-zhang/SageAttention3/discussions/3#6883b71543d2651d281c9cc0, but it seems to be a custom implementation from here kijai/ComfyUI-WanVideoWrapper@a35eb7d. Any ideas are welcome. |
Add preliminary support to sage attention 3 (https://github.com/thu-ml/SageAttention and https://huggingface.co/jt-zhang/SageAttention3) for attention.
Help in implementation is welcome, as when testing on txt2img and img2img on SDXL on a RTX 5090, on Linux, I'm getting ~10% slower speeds vs sage attention 2, so probably I'm missing something here. EDIT: Thanks to Kijai I have updated per_block_mean to False and it should make a difference.
If someone can test with video it would be great.
Also maybe it would be a good idea to have different flags on comfy.model_management and comfy.cli_args for the 2 different versions?