Conversation

patil-suraj (Contributor) commented Oct 10, 2022

Currently Stable Diffusion (and diffusers in general) doesn't work with bf16, because nearest upsampling in torch is not supported for bf16.

Minimal code to reproduce:

import torch
import torch.nn.functional as F

image = torch.randn(1, 4, 32, 32).to(device="cuda", dtype=torch.bfloat16)
# this raises an error because nearest upsampling has no bf16 kernel
out = F.interpolate(image, size=(64, 64), mode="nearest")

Additionally, in the pipelines we need to cast the images to fp32, since bf16 is not yet supported in numpy.
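
For example, the conversion at the end of the pipeline would look roughly like this (a minimal sketch of the pattern, not the exact pipeline code; the image tensor here is just a stand-in for the decoded VAE output):

import torch

image = torch.rand(1, 3, 512, 512, device="cuda", dtype=torch.bfloat16)
# numpy has no bfloat16 dtype, so cast to fp32 before converting
image_np = image.float().cpu().permute(0, 2, 3, 1).numpy()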

This is a draft PR that enables bf16 training/inference for Stable Diffusion by casting inputs to fp32 where bf16 is not supported.
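
The idea, roughly, is the following (a sketch of the approach with a hypothetical helper name; the actual change lives in the upsampling blocks):

import torch
import torch.nn.functional as F

def upsample_nearest_bf16_safe(hidden_states, scale_factor=2.0):
    # nearest upsampling has no bf16 kernel, so round-trip through fp32
    dtype = hidden_states.dtype
    if dtype == torch.bfloat16:
        hidden_states = hidden_states.to(torch.float32)
    hidden_states = F.interpolate(hidden_states, scale_factor=scale_factor, mode="nearest")
    return hidden_states.to(dtype)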

Not sure if this is the right way; curious to hear your feedback @patrickvonplaten @NouamaneTazi

fixes #771

HuggingFaceDocBuilderDev commented Oct 10, 2022

The documentation is not available anymore as the PR was closed or merged.

cos_dist = cosine_distance(image_embeds, self.concept_embeds).cpu()

# cast to float32 as numpy does not support bfloat16
if image_embeds.dtype == torch.bfloat16:
patrickvonplaten (Contributor) commented:

I don't think we need an if statement here, as .float() should always work and be correct (e.g. we can't do fp16 on CPU either, and calling numpy() will move it to fp32 anyway).
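
i.e. something along these lines (a sketch of the suggestion, assuming the result is converted to numpy right after):

# .float() is a no-op for fp32 and handles fp16/bf16 uniformly, so no dtype check is needed
cos_dist = cosine_distance(image_embeds, self.concept_embeds).float().cpu().numpy()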

NouamaneTazi (Member) replied:

@patrickvonplaten why would numpy() move the arrays to fp32? I thought that numpy arrays support fp16?
I believe we should use .half() instead, which should also work on CPU?
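
For reference, a quick check of what .numpy() does with the two dtypes (a small sketch to illustrate the question, not code from this PR):

import torch

x_fp16 = torch.randn(2, 2).half()
print(x_fp16.numpy().dtype)  # float16 -- numpy does support fp16

x_bf16 = torch.randn(2, 2).bfloat16()
# x_bf16.numpy() fails, since numpy has no bfloat16 dtype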

NouamaneTazi (Member) left a comment:

Unfortunately, some of these precision modifications will add more unrolled_elementwise_kernel<direct_copy_kernel_cuda> kernels, but at least they don't require a CPU-GPU sync and don't happen inside a loop. So LGTM :-)

@patil-suraj patil-suraj marked this pull request as ready for review October 11, 2022 09:34
@patil-suraj patil-suraj merged commit 797b290 into main Oct 11, 2022
@patil-suraj patil-suraj deleted the sd-bf16 branch October 11, 2022 10:02
prathikr pushed a commit to prathikr/diffusers that referenced this pull request Oct 26, 2022
* support bf16 for stable diffusion

* fix typo

* address review comments

PhaneeshB pushed a commit to nod-ai/diffusers that referenced this pull request Mar 1, 2023

yoonseokjin pushed a commit to yoonseokjin/diffusers that referenced this pull request Dec 25, 2023
* support bf16 for stable diffusion

* fix typo

* address review comments
Successfully merging this pull request may close these issues.

BF16 doesn't work with dreambooth
