FluxPipeline silently rounds the generated image shape #9904

@albertochimentiinbibo

Describe the bug

When asking the FluxPipeline class to generate an image with shape (1920, 1080), the output image shape is silently rounded to (1920, 1072), which looks like rounding down to the nearest multiple of 16 instead of 8.
As the FluxPipeline class accepts input sizes divisible by 8, I would expect the requested size to be preserved throughout the generation process.

From a quick look at the code, it seems that in the FluxPipeline._unpack_latents method the height and width are floor divided (//) by the vae_scale_factor, which is 16.
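To make the arithmetic concrete, here is a minimal sketch of what floor division by the scale factor does to the requested size. It is not the pipeline's actual code: `floor_to_multiple` is a hypothetical helper written only for illustration, and it assumes the effective rounding is "divide by the factor, drop the remainder, multiply back".

```python
def floor_to_multiple(size: int, factor: int) -> int:
    """Round `size` down to the nearest multiple of `factor` (hypothetical helper)."""
    return (size // factor) * factor


requested = (1920, 1080)  # (width, height), as in the report

for factor in (8, 16):
    rounded = tuple(floor_to_multiple(s, factor) for s in requested)
    print(f"factor={factor}: requested={requested} -> rounded={rounded}")

# 1080 is divisible by 8 (1080 = 135 * 8) but not by 16 (1080 // 16 = 67, 67 * 16 = 1072),
# so a factor of 16 reproduces the observed (1920, 1072) output.
```

Under that assumption, any height or width that is a multiple of 16 would pass through unchanged, while sizes that are only multiples of 8 would lose 8 pixels.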

I would love to understand why the scale factor is set the way it is here:
https://github.com/huggingface/diffusers/blob/89e4d6219805975bd7d253a267e1951badc9f1c0/src/diffusers/pipelines/flux/pipeline_flux.py#L197C9-L199C10

Reproduction

Here is the minimal code to reproduce the bug. Feel free to change the number of inference steps, as it should not affect the result of the test.

from diffusers.pipelines import FluxPipeline
import torch

bf_repo = "black-forest-labs/FLUX.1-dev"

prompt = "Astronaut drinking coffe on the moon."
shape = (1920, 1080)

pipe = FluxPipeline.from_pretrained(bf_repo, torch_dtype=torch.bfloat16)

pipe.enable_sequential_cpu_offload()

image = pipe(
    prompt,
    height=shape[1],
    width=shape[0],
    num_inference_steps=28,
    generator=torch.Generator('cpu').manual_seed(123)
    ).images[0]

print(f"Prompted shape: {shape}")
print(f"Generated shape: {image.size}")
image.show()

Logs

Prompted shape: (1920, 1080)
Generated shape: (1920, 1072)

System Info

  • 🤗 Diffusers version: 0.31.0
  • Platform: Windows-10-10.0.22631-SP0
  • Running on Google Colab?: No
  • Python version: 3.10.6
  • PyTorch version (GPU?): 2.4.1+cu124 (True)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Huggingface_hub version: 0.26.2
  • Transformers version: 4.44.2
  • Accelerate version: 0.34.2
  • PEFT version: not installed
  • Bitsandbytes version: not installed
  • Safetensors version: 0.4.5
  • xFormers version: not installed
  • Accelerator: NVIDIA GeForce RTX 4070 Laptop GPU, 8188 MiB
  • Using GPU in script?: yes
  • Using distributed or parallel set-up in script?: no

Who can help?

@sayakpaul @DN6
