
Conversation

@townwish4git
Contributor

What does this PR do?

Fixes # (issue)
Fix incorrect call to self.decode() within AsymmetricAutoencoderKL.forward():

- dec = self.decode(z, sample, mask).sample
+ dec = self.decode(z, generator, sample, mask).sample

Related to: issue #8317

Before submitting

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@sayakpaul @yiyixuxu @DN6

@tolgacangoz
Contributor

Thanks for opening this PR!
I wonder what generator is used for here?

@townwish4git
Contributor Author

> Thanks for opening this PR! I wonder what generator is used for here?

The generator used here is the one passed into AsymmetricAutoencoderKL.forward():

def forward(
        self,
        sample: torch.Tensor,
        mask: Optional[torch.Tensor] = None,
        sample_posterior: bool = False,
        return_dict: bool = True,
        generator: Optional[torch.Generator] = None,
    ) -> Union[DecoderOutput, Tuple[torch.Tensor]]:
        ...
-        dec = self.decode(z, sample, mask).sample
+        dec = self.decode(z, generator, sample, mask).sample
        ...

@tolgacangoz
Contributor

But the decode() function doesn't use it 🤔.

else:
    z = posterior.mode()
- dec = self.decode(z, sample, mask).sample
+ dec = self.decode(z, generator, sample, mask).sample
Member


I tend to agree with @tolgacangoz's comments here. It's not used in the decode() function, similar to AutoencoderKL; it's used in forward():

z = posterior.sample(generator=generator)
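
For illustration, here is a minimal sketch of that split (the class below is a hypothetical stand-in for diffusers' DiagonalGaussianDistribution, not the actual implementation): posterior sampling is the only stochastic step, so it is the only place the generator matters, while mode() and decoding are deterministic.

import torch

class DiagonalGaussianPosterior:
    # Toy posterior with the same sample()/mode() split as in diffusers.
    def __init__(self, mean: torch.Tensor, logvar: torch.Tensor):
        self.mean = mean
        self.std = torch.exp(0.5 * logvar)

    def sample(self, generator: torch.Generator = None) -> torch.Tensor:
        # The only place the generator is consumed: reproducible noise.
        noise = torch.randn(self.mean.shape, generator=generator)
        return self.mean + self.std * noise

    def mode(self) -> torch.Tensor:
        # Deterministic path; no generator involved.
        return self.mean

posterior = DiagonalGaussianPosterior(torch.zeros(1, 4, 8, 8), torch.zeros(1, 4, 8, 8))
z = posterior.sample(generator=torch.Generator().manual_seed(0))  # reproducible draw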

Member


I don't think there's a stochastic component in the decode() function in the first place, so I further think there's no need to have generator here either:

def decode(self, z: torch.Tensor, return_dict: bool = True, generator=None) -> Union[DecoderOutput, torch.Tensor]:

@yiyixuxu WDYT here?

Collaborator

@sayakpaul I had the same questions, but I think their explanation makes sense.

@townwish4git
Contributor Author

> But the decode() function doesn't use it 🤔.

I think this is to maintain consistency of the decode() interface across different VAEs. For example, here is a demo using AsymmetricAutoencoderKL:

...
from diffusers import AsymmetricAutoencoderKL, StableDiffusionInpaintPipeline
...
pipe = StableDiffusionInpaintPipeline.from_pretrained("runwayml/stable-diffusion-inpainting")
pipe.vae = AsymmetricAutoencoderKL.from_pretrained("cross-attention/asymmetric-autoencoder-kl-x-1-5")
pipe.to("cuda")

image = pipe(prompt=prompt, image=image, mask_image=mask_image).images[0]
image.save("image.jpeg")

In this case, when the original VAE is replaced with AsymmetricAutoencoderKL, you don't have to modify the code that calls self.vae.decode() within the pipeline; you can simply retain the original code, where the generator argument is passed in:

image = self.vae.decode(
    latents / self.vae.config.scaling_factor, return_dict=False, generator=generator, **condition_kwargs
)[0]
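
To illustrate the design point with hypothetical stand-in classes (not the actual diffusers implementations): accepting generator in every decode() signature, even where it is unused, keeps the VAEs drop-in compatible with pipeline code that always passes it.

import torch

class StochasticDecodeVAE:
    def decode(self, z: torch.Tensor, generator: torch.Generator = None):
        # Hypothetical VAE whose decode actually consumes the generator.
        noise = torch.randn(z.shape, generator=generator)
        return z + 0.01 * noise

class DeterministicDecodeVAE:
    def decode(self, z: torch.Tensor, generator: torch.Generator = None):
        # generator is accepted but ignored: decoding is fully deterministic,
        # yet the signature stays call-compatible with the class above.
        return z

def pipeline_decode(vae, latents: torch.Tensor, generator: torch.Generator = None):
    # Pipeline code can pass generator= regardless of which VAE is plugged in.
    return vae.decode(latents, generator=generator)

latents = torch.zeros(1, 4, 8, 8)
pipeline_decode(DeterministicDecodeVAE(), latents, torch.Generator().manual_seed(0))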


@yiyixuxu left a comment


thanks!

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@yiyixuxu
Collaborator

yiyixuxu commented Jun 3, 2024

@tolgacangoz @sayakpaul
I want to understand a little bit more before merging this PR.
With this demo #8378 (comment), I think it will currently throw an error, so this PR fixes it.

Is there a different way to use AsymmetricAutoencoderKL? I'm asking because we updated the signature and could potentially break a previous use case that was working. If it was previously not working, we are good.

@townwish4git
Contributor Author

@yiyixuxu Most use cases call the AsymmetricAutoencoderKL.encode() or decode() functions, which will not be affected before or after merging.

I'm not sure whether any current use case calls forward() directly, but if one exists:

  • Before merging: it wouldn't throw an error, but would produce unexpected results, as issue #8317 mentions (see the sketch below)
  • After merging: it gets fixed
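
To make the silent failure concrete, here is a sketch with a simplified stand-in for AsymmetricAutoencoderKL.decode() (the positional order z, generator, image, mask matches the fix above; shapes are illustrative only):

import torch

def decode(z, generator=None, image=None, mask=None):
    # Simplified stand-in: generator is unused; image/mask condition the decoder.
    return {"generator": generator, "image": image, "mask": mask}

z = torch.zeros(1, 4, 8, 8)
sample = torch.zeros(1, 3, 64, 64)
mask = torch.ones(1, 1, 64, 64)

# Before the fix: sample lands in the unused generator slot, mask lands in the
# image slot, and the real mask parameter is silently left as None -- no error.
bad = decode(z, sample, mask)

# After the fix: every argument reaches the parameter it was meant for.
good = decode(z, generator=None, image=sample, mask=mask)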

@yiyixuxu merged commit 6be43bd into huggingface:main on Jun 4, 2024.
sayakpaul pushed a commit that referenced this pull request on Dec 23, 2024.