Skip to content

Conversation

@hlky
Copy link
Contributor

@hlky hlky commented Dec 12, 2024

What does this PR do?

#10131 (comment)

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@yiyixuxu @vladmandic

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@vladmandic
Copy link
Contributor

looks great, thanks!

Copy link
Collaborator

@yiyixuxu yiyixuxu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok but I actually like the ControlNetUnionInput, it is pretty difficult to use otherwise having to figure out which mode is which number for given checkpoint

@hlky
Copy link
Contributor Author

hlky commented Dec 12, 2024

Yeah I like it too 🤷‍♂️ seems like integration in some ui is the same whether it's mapping the names in the class or the indices, and we know that the class should be used because it's the specific pipeline.

@yiyixuxu yiyixuxu merged commit e8b65bf into huggingface:main Dec 12, 2024
12 checks passed
@vladmandic
Copy link
Contributor

vladmandic commented Dec 13, 2024

tiny, but breaking issue in:

    _callback_tensor_inputs = [
        "latents",
        "prompt_embeds",
        "negative_prompt_embeds",
        "add_text_embeds",
        "add_time_ids",
        "negative_pooled_prompt_embeds",
        "negative_add_time_ids",
        "image",
    ]

but image doesnt exist anymore - remove image from callback list or there is an instant runtime error.

and one oversight - we have new txt/img/inpaint pipelines, but no autopipeline mapping to connect them.

@hlky
Copy link
Contributor Author

hlky commented Dec 13, 2024

I tested this with the same examples as before, I've just ran it again and it works, is there something specific you're doing to get "an instant runtime error"? would you mind sharing all the relevant information?

@vladmandic
Copy link
Contributor

vladmandic commented Dec 13, 2024

if you have a callback defined, then you get a runtime error since there is no image param.
your tests do not run with callback. only thing that is needed is to remove image from callback params listed above. you can replace it with control_image if you want, but its pointless since its not processed anymore at that time.

actual error:

│ /home/vlado/dev/sdnext/venv/lib/python3.12/site-packages/diffusers/pipelines/controlnet/pipeline_controlnet_union_sd_xl.py:1449 in __call__                                                                                                                                                                                                                                                                                      │
│                                                                                                                                                                                                                                                                                                                                                                                                                                  │
│   1448 │   │   │   │   │   for k in callback_on_step_end_tensor_inputs:                                                                                                                                                                                                                                                                                                                                                          │
│ ❱ 1449 │   │   │   │   │   │   callback_kwargs[k] = locals()[k]                                                                                                                                                                                                                                                                                                                                                                  │
│   1450 │   │   │   │   │   callback_outputs = callback_on_step_end(self, i, t, callback_kwargs)                                                                                                                                                                                                                                                                                                                                  │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
KeyError: 'image'

@hlky
Copy link
Contributor Author

hlky commented Dec 13, 2024

#10218

@vladmandic
Copy link
Contributor

vladmandic commented Dec 13, 2024

yup, thats exactly it - thanks!

will you also add mappings in pipelines/auto_pipeline.py to link new txt/img/inpaint pipelines?
and what are the actual values for control_mode? i'm guessing its
- 0: openpose
- 1: depth
- 2: hed/pidi/scribble/ted
- 3: canny/lineart/anime_lineart/mlsd
- 4: normal
- 5: segment
- 6: tile
- 7: repaint

@hlky
Copy link
Contributor Author

hlky commented Dec 14, 2024

and what are the actual values for control_mode? i'm guessing its

Yes, this is why we had the dataclass btw.

@vladmandic
Copy link
Contributor

in general, i like the concept of dataclass, just overriding image when its not-an-image was messing things badly.
anyhow, i think we're really close, just need the autopipeline mappings.

@hlky
Copy link
Contributor Author

hlky commented Dec 14, 2024

just overriding image when its not-an-image was messing things badly.

So we can understand the effect downstream could you link to the code that didn't work when it's not an image?

#10219 for autopipeline, only for from_pretrained because it's very late, let me know how much of a priority from_pipe is.

@vladmandic
Copy link
Contributor

i've tested with #10218 + #10219 and seems ok.

fyi, typical sdnext workflow is to:

  • load a model
  • switch a pipeline in runtime depending on user inputs.
    e.g., if user provided input image, we'll switch to img2img. if mask is provided, we'll switch to inpaint. otherwise we'll run in txt2img.
  • inspect pipeline for available args and set them as appropriate.
    using inspect.signature(cls.__call__, follow_wrapped=True, eval_str=True)
    e.g. if prompt_embeds are allowed as param, we'll try to pre-calcululate them, otherwise we'll pass prompt as-is.
    thus overloading any known param such as image that changes its meaning would be a nightmare.
    if image param is available, its going to be handled. but it cannot be a PIL.Image in one pipeline and a DataClass in another.

@vladmandic
Copy link
Contributor

vladmandic commented Dec 14, 2024

one more issue - num_control_type is hard coded as 6 in
(

)

but its not the same number of Union and ProMax - even in your example, you used tile mode. but we cant specify control_mode=6 manually (which should be tile by looking at the code) since its larger than allowed num_control_type.

│ /home/vlado/dev/sdnext/venv/lib/python3.12/site-packages/diffusers/pipelines/controlnet/pipeline_controlnet_union_sd_xl_img2img.py:1309 in __call__                                                                                                                                                                                                                                                                              │
│                                                                                                                                                                                                                                                                                                                                                                                                                                  │
│   1308 │   │   for _image, control_idx in zip(control_image, control_mode):                                                                                                                                                                                                                                                                                                                                                      │
│ ❱ 1309 │   │   │   control_type[control_idx] = 1                                                                                                                                                                                                                                                                                                                                                                                 │
│   1310 │   │   │   self.check_inputs(                                                                                                                                                                                                                                                                                                                                                                                            │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
IndexError: list assignment index out of range

btw, it also shows that my question from yesterday - what exactly are valid control_mode values for both Union and ProMax is not that obvious - please document.

@hlky
Copy link
Contributor Author

hlky commented Dec 14, 2024

How are you loading ProMax? Try using brad-twinkl/controlnet-union-sdxl-1.0-promax or OzzyGT/controlnet-union-promax-sdxl-1.0, these are uploaded correctly with the right config. If you're loading diffusion_pytorch_model_promax.safetensors from the original repo the config will be wrong, should work if you pass num_control_type=8 to from_pretrained.

@vladmandic
Copy link
Contributor

right - yes, that has correct config, ignore the last msg.

sayakpaul pushed a commit that referenced this pull request Dec 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants