
Conversation

@nifleisch (Collaborator)

Description

This PR adds the Pyramid Attention Broadcast (PAB) and FasterCache algorithms from diffusers (https://huggingface.co/docs/diffusers/main/api/cache).
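For context, the diffusers cache API that these algorithms come from looks roughly like the sketch below, based on the linked docs; the CogVideoX checkpoint and the parameter values are illustrative, not defaults introduced by this PR:

import torch
from diffusers import CogVideoXPipeline, PyramidAttentionBroadcastConfig

pipe = CogVideoXPipeline.from_pretrained("THUDM/CogVideoX-5b", torch_dtype=torch.bfloat16)
pipe.to("cuda")

# Reuse the cached spatial attention output on every other step inside the
# given timestep window instead of recomputing it.
config = PyramidAttentionBroadcastConfig(
    spatial_attention_block_skip_range=2,
    spatial_attention_timestep_skip_range=(100, 800),
    current_timestep_callback=lambda: pipe.current_timestep,
)
pipe.transformer.enable_cache(config)

FasterCache is enabled the same way via a FasterCacheConfig.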

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

I manually tested both caching mechanisms across all supported diffusers pipelines, visually inspected the resulting images and videos, and measured the inference time (the relative speedups match this benchmark). For FLUX, I evaluated every supported combination of algorithms. I implemented new tests for each algorithm and for all of their combinations with other algorithms.

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

Additional Notes

I tried to make these cachers work with compilation but ran into errors. The main problem is that these methods introduce a runtime condition in the attention layers (recompute or reuse the cached output), which hinders compilation; see the sketch below.
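A minimal, hypothetical sketch of the pattern that causes this (class and attribute names are made up for the example, not the actual implementation): the decision to reuse the cache depends on mutable runtime state, so torch.compile has to break the graph at the branch.

import torch

class CachingAttentionWrapper(torch.nn.Module):
    # Illustrative only: wraps an attention module and sometimes returns a
    # cached output instead of recomputing it.
    def __init__(self, attn: torch.nn.Module, skip_range: int = 2):
        super().__init__()
        self.attn = attn
        self.skip_range = skip_range
        self.cache = None
        self.step = 0

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        self.step += 1
        # Data-dependent Python branch: whether we recompute depends on
        # mutable state, which forces a graph break under torch.compile.
        if self.cache is not None and self.step % self.skip_range != 0:
            return self.cache
        self.cache = self.attn(hidden_states)
        return self.cache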

@nifleisch changed the title from "Feat/add fastercache and pab" to "feat: add fastercache and pab" May 6, 2025
@johnrachwan123 (Member) left a comment

LGTM

@sharpenb (Member) left a comment

Looks good! Mostly small comments before it can be merged.

@sharpenb (Member) left a comment

I re-added my comments that were dropped in the last review ;)

"""
imported_modules = self.import_algorithm_packages()
# set default values
temporal_attention_block_skip_range: Optional[int] = None
@sharpenb (Member) left a comment

I would still recommend putting them in the smash config as constants and mentioning that they can be overwritten for different architectures, with a link to the code file or the diffusers PR, so that the documentation is complete.

@nifleisch (Collaborator, Author) left a comment

I agree. The best solution would be to use the pipeline-specific defaults unless the user explicitly specifies a parameter. Unfortunately, our SmashConfig interface doesn't support that logic right now. To implement it, we'd have to apply the same defaults across every pipeline. Given the large number of parameters, most of which aren't straightforward to tune, I'll leave things as they are for now. This approach ensures users get strong out-of-the-box results and a straightforward interface, albeit with fewer tuning options. A sketch of the intended override behavior follows.
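To make the intended behavior concrete, a hypothetical sketch (none of these names are part of the actual SmashConfig interface, and the defaults shown are examples): per-pipeline defaults that explicit user settings override.

from typing import Any, Dict

# Hypothetical per-pipeline defaults; pipeline names and values are illustrative.
PIPELINE_DEFAULTS: Dict[str, Dict[str, Any]] = {
    "FluxPipeline": {"spatial_attention_block_skip_range": 2},
    "CogVideoXPipeline": {
        "spatial_attention_block_skip_range": 2,
        "temporal_attention_block_skip_range": 4,
    },
}

def resolve_cache_config(pipeline_name: str, user_overrides: Dict[str, Any]) -> Dict[str, Any]:
    config = dict(PIPELINE_DEFAULTS.get(pipeline_name, {}))
    config.update(user_overrides)  # explicitly set values always win
    return config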

@sharpenb (Member) left a comment

Makes sense. As discussed async, let's make clear in the PR description that we have pipeline-specific defaults in this PR that will need some iteration :)

Comment on lines +117 to +132
temporal_attention_block_skip_range: Optional[int] = None
spatial_attention_timestep_skip_range: Tuple[int, int] = (-1, 681)
temporal_attention_timestep_skip_range: Optional[Tuple[int, int]] = None
low_frequency_weight_update_timestep_range: Tuple[int, int] = (99, 901)
high_frequency_weight_update_timestep_range: Tuple[int, int] = (-1, 301)
unconditional_batch_skip_range: int = 5
unconditional_batch_timestep_skip_range: Tuple[int, int] = (-1, 641)
spatial_attention_block_identifiers: Tuple[str, ...] = (
    "blocks.*attn1",
    "transformer_blocks.*attn1",
    "single_transformer_blocks.*attn1",
)
temporal_attention_block_identifiers: Tuple[str, ...] = ("temporal_transformer_blocks.*attn1",)
attention_weight_callback = lambda _: 0.5 # noqa: E731
tensor_format: str = "BFCHW"
is_guidance_distilled: bool = False
@sharpenb (Member) left a comment

I would still recommend putting them in the smash config as constants and mentioning that they can be overwritten for different architectures, with a link to the code file or the diffusers PR, so that the documentation is complete :)

@nifleisch requested a review from sharpenb May 12, 2025 15:24
@sharpenb (Member) left a comment

Let's go!

@nifleisch merged commit 4827f66 into main May 12, 2025
7 checks passed
@nifleisch deleted the feat/add-fastercache-and-pab branch May 12, 2025 16:37
davidberenstein1957 pushed a commit that referenced this pull request May 13, 2025
* fix: correct docstring in deepcache

* feat: add model checks

* feat: add pyramid attention broadcast (pab) cacher

* feat: add fastercache cacher

* tests: add flux tiny random fixture

* tests: add algorithms tests for pab and fastercache

* tests: add combination tests for pab and fastercache

* fix: add 1 as value for interval parameter
