
Support OpenAI's Consistency Decoder #5666

@Pirog17000

Description

Source - https://github.com/openai/consistencydecoder

OpenAI has open-sourced its Consistency Decoder, which decodes latents noticeably better than the stock VAE decoder, producing more consistent images with fewer artifacts.
However, right now I see no way to properly decode latents with the code they provide. Maybe some kind of solution already exists, but this seems like something that could be added to diffusers as officially supported.
The Consistency Decoder weights ship in .pt format; they could possibly be converted to .safetensors and loaded like a regular VAE (see the sketch below).
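A minimal sketch of such a conversion, assuming the downloaded checkpoint is a TorchScript archive as in the upstream repo (the file name below is hypothetical, and mapping the resulting tensor names onto a diffusers VAE class is a separate problem):

```python
import torch
from safetensors.torch import save_file

# Hypothetical path: the upstream ConsistencyDecoder caches its weights
# under ~/.cache/clip/ by default; adjust the file name to whatever it
# actually downloads.
ckpt = torch.jit.load("consistency_decoder.pt", map_location="cpu")

# Extract plain tensors from the TorchScript module; safetensors requires
# contiguous tensors keyed by name.
state_dict = {k: v.contiguous() for k, v in ckpt.state_dict().items()}
save_file(state_dict, "consistency_decoder.safetensors")
```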
The decoder is also memory-heavy, so to decode an image quickly there is a real need to unload the pipeline/model from VRAM, at least to the CPU.
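Something along these lines (just a sketch of the offload I mean; `pipe` as in the code below):

```python
import torch

# Once the latents exist, the UNet is no longer needed on the GPU:
# push the pipeline to system RAM and release PyTorch's cached blocks
# so the large decoder has room.
pipe.to("cpu")
torch.cuda.empty_cache()
```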

Here's my code, which produces garbage-grade images (specifically out of the SDXL pipeline):

```python
import torch
from diffusers import StableDiffusionXLPipeline
from consistencydecoder import ConsistencyDecoder, save_image

seed = 42
generator = torch.Generator(device="cuda").manual_seed(seed)

print("Loading pipe")
# model_path and prompt are defined elsewhere
pipe = StableDiffusionXLPipeline.from_single_file(
    model_path, torch_dtype=torch.float16, variant="fp16", use_safetensors=True
).to("cuda")

pipe.vae.cuda()
decoder_consistency = ConsistencyDecoder(device="cuda:0")

# Ask the pipeline for raw latents instead of decoded images
latent = pipe(prompt=prompt, width=1024, height=1024,
              num_inference_steps=20,
              generator=generator,
              output_type="latent",
              ).images

# Offload the pipeline so the memory-heavy decoder fits in VRAM
pipe.to("cpu")
print("Decoding latent")
sample = decoder_consistency(latent)
save_image(sample, "output.png")
```
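For reference, the upstream README pairs the decoder with runwayml/stable-diffusion-v1-5, so the garbage output above may simply be a latent-space mismatch: SDXL uses a different VAE than the SD 1.x models the decoder was built around. A minimal sketch of what I'd expect to work, assuming diffusers' `output_type="latent"` returns latents still multiplied by the VAE scaling factor (which the decoder seems to apply internally, so it has to be divided back out first):

```python
import torch
from diffusers import StableDiffusionPipeline
from consistencydecoder import ConsistencyDecoder, save_image

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
decoder_consistency = ConsistencyDecoder(device="cuda:0")

generator = torch.Generator(device="cuda").manual_seed(42)
latent = pipe(prompt="an astronaut riding a horse",  # illustrative prompt
              num_inference_steps=20,
              generator=generator,
              output_type="latent",
              ).images

# Assumption: pipeline latents are in the scaled space, while the decoder
# applies the 0.18215 scaling itself, so undo it here.
latent = latent / pipe.vae.config.scaling_factor

# Free VRAM before running the memory-heavy decoder
pipe.to("cpu")
torch.cuda.empty_cache()

sample = decoder_consistency(latent)
save_image(sample, "output_sd15.png")
```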
