-
Notifications
You must be signed in to change notification settings - Fork 6.5k
Add LDM Super Resolution pipeline #1116
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
The documentation is not available anymore as the PR was closed or merged. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great PR @duongna21 and super cool addition! Tried the pipeline and it works super well already.
I left some comments, we need to address a few things before we can merge this.
Mainly
- handling
dtypeanddevice - handling different
schedulers - add doc page for the pipeline
- add tests for the
pipelineintests/pipelines/ldm_superresolition
Let me know if you need help with any of this. Great work!
src/diffusers/pipelines/latent_diffusion/pipeline_latent_diffusion_superresolution.py
Outdated
Show resolved
Hide resolved
src/diffusers/pipelines/latent_diffusion/pipeline_latent_diffusion_superresolution.py
Outdated
Show resolved
Hide resolved
src/diffusers/pipelines/latent_diffusion/pipeline_latent_diffusion_superresolution.py
Outdated
Show resolved
Hide resolved
src/diffusers/pipelines/latent_diffusion/pipeline_latent_diffusion_superresolution.py
Outdated
Show resolved
Hide resolved
src/diffusers/pipelines/latent_diffusion/pipeline_latent_diffusion_superresolution.py
Outdated
Show resolved
Hide resolved
src/diffusers/pipelines/latent_diffusion/pipeline_latent_diffusion_superresolution.py
Outdated
Show resolved
Hide resolved
src/diffusers/pipelines/latent_diffusion/pipeline_latent_diffusion_superresolution.py
Show resolved
Hide resolved
src/diffusers/pipelines/latent_diffusion/pipeline_latent_diffusion_superresolution.py
Show resolved
Hide resolved
src/diffusers/pipelines/latent_diffusion/pipeline_latent_diffusion_superresolution.py
Outdated
Show resolved
Hide resolved
src/diffusers/pipelines/latent_diffusion/pipeline_latent_diffusion_superresolution.py
Outdated
Show resolved
Hide resolved
patrickvonplaten
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Super cool addition - think we only need to solve some nits and then this is good to go :-)
…sion_superresolution.py Co-authored-by: Patrick von Platen <[email protected]>
…sion_superresolution.py Co-authored-by: Patrick von Platen <[email protected]>
…sion_superresolution.py Co-authored-by: Suraj Patil <[email protected]>
…sion_superresolution.py Co-authored-by: Suraj Patil <[email protected]>
|
@patil-suraj @patrickvonplaten Thanks you very much for the detailed comments. I learned a lot about the library when trying to address them. Please check out the fixes. |
|
Hey @duongna21 super cool! This PR also fixes typos in other pipelines. It would be best to open a separate PR for this and keep this PR only for super-resolution pipeline. It's better to have single purpose PR, so it's easy to test and review. Hope you understand :) |
@patil-suraj Sure, indeed we should do that. Unfixed the typo. |
tests/pipelines/latent_diffusion/test_latent_diffusion_superresolution.py
Outdated
Show resolved
Hide resolved
|
|
||
| def preprocess(image): | ||
| w, h = image.size | ||
| w, h = map(lambda x: x - x % 32, (w, h)) # resize to integer multiple of 32 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this necessary? An alternative would be to pad and then crop the upscaled image. Not sure if it's worth it, slightly worried that this might skew images a little bit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pcuenca This is how other pipelines resize the image so it can successfully forward over UNet (agree that it might skew the image). Really sorry I can't fully understand your suggestion, could you kindly push a commit for it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here the preprocessing should be similar to how it's done in the original repo, since the model is trained on the preprocessed image. @duongna21 could post a link to the original inference code ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@patil-suraj Look at this and this. It works great with varying img size. But I can't spend time on this in the next few days.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks and no worries. We'll try to take a look at this, we can merge the PR without that also.
src/diffusers/pipelines/latent_diffusion/pipeline_latent_diffusion_superresolution.py
Outdated
Show resolved
Hide resolved
…solution.py Co-authored-by: Pedro Cuenca <[email protected]>
|
@pcuenca Thanks a lot for helpful suggestions. The tests look good now. |
patil-suraj
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR is looking good! Thank you for addressing the comments :)
Will run the slow tests and upload the checkpoint under CompVis org on the hub.
One last thing to verify is to check if the preprocessing code is similar to how it's done in the original repo.
Then this should be good to merge :)
|
|
||
| def preprocess(image): | ||
| w, h = image.size | ||
| w, h = map(lambda x: x - x % 32, (w, h)) # resize to integer multiple of 32 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here the preprocessing should be similar to how it's done in the original repo, since the model is trained on the preprocessed image. @duongna21 could post a link to the original inference code ?
src/diffusers/pipelines/latent_diffusion/pipeline_latent_diffusion_superresolution.py
Outdated
Show resolved
Hide resolved
src/diffusers/pipelines/latent_diffusion/pipeline_latent_diffusion_superresolution.py
Outdated
Show resolved
Hide resolved
src/diffusers/pipelines/latent_diffusion/pipeline_latent_diffusion_superresolution.py
Outdated
Show resolved
Hide resolved
src/diffusers/pipelines/latent_diffusion/pipeline_latent_diffusion_superresolution.py
Outdated
Show resolved
Hide resolved
tests/pipelines/latent_diffusion/test_latent_diffusion_superresolution.py
Show resolved
Hide resolved
…sion_superresolution.py Co-authored-by: Suraj Patil <[email protected]>
…sion_superresolution.py Co-authored-by: Suraj Patil <[email protected]>
…sion_superresolution.py Co-authored-by: Suraj Patil <[email protected]>
…sion_superresolution.py Co-authored-by: Suraj Patil <[email protected]>
|
Thanks a lot @duongna21 ! Uploaded the checkpoint under official account. https://huggingface.co/CompVis/ldm-super-resolution-4x-openimages |
| # See the License for the specific language governing permissions and | ||
| # limitations under the License. | ||
|
|
||
| import random |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice tests!
* Add ldm super resolution pipeline * style * fix copies * style * fix doc * Update src/diffusers/pipelines/latent_diffusion/pipeline_latent_diffusion_superresolution.py Co-authored-by: Patrick von Platen <[email protected]> * Update src/diffusers/pipelines/latent_diffusion/pipeline_latent_diffusion_superresolution.py Co-authored-by: Patrick von Platen <[email protected]> * Update src/diffusers/pipelines/latent_diffusion/pipeline_latent_diffusion_superresolution.py Co-authored-by: Suraj Patil <[email protected]> * Update src/diffusers/pipelines/latent_diffusion/pipeline_latent_diffusion_superresolution.py Co-authored-by: Suraj Patil <[email protected]> * add doc * address comments * address comments * fix doc * minor * add tests * add tests * load text encoder from subfolder * fix test * fix test * style * style * handle mps latents * unfix typo * unfix typo * Update tests/pipelines/latent_diffusion/test_latent_diffusion_superresolution.py Co-authored-by: Pedro Cuenca <[email protected]> * fix set_timesteps mps * fix set_timesteps mps * Update src/diffusers/pipelines/latent_diffusion/pipeline_latent_diffusion_superresolution.py Co-authored-by: Suraj Patil <[email protected]> * Update src/diffusers/pipelines/latent_diffusion/pipeline_latent_diffusion_superresolution.py Co-authored-by: Suraj Patil <[email protected]> * Update src/diffusers/pipelines/latent_diffusion/pipeline_latent_diffusion_superresolution.py Co-authored-by: Suraj Patil <[email protected]> * Update src/diffusers/pipelines/latent_diffusion/pipeline_latent_diffusion_superresolution.py Co-authored-by: Suraj Patil <[email protected]> * style * test 64x64 instead of 256x256 Co-authored-by: Patrick von Platen <[email protected]> Co-authored-by: Suraj Patil <[email protected]> Co-authored-by: Pedro Cuenca <[email protected]>
4x Super Resolution by Latent Diffusion Model (original checkpoint here). Might fixes #463 and fixes #146.
How to use:
->
->
->
->
cc @patrickvonplaten @patil-suraj @pcuenca