Conversation

@duongna21
Contributor

@duongna21 duongna21 commented Nov 3, 2022

4x Super Resolution by Latent Diffusion Model (original checkpoint here). Might fix #463 and #146.

How to use:

pip install git+https://github.com/duongna21/diffusers.git@add-sr-pipeline
from diffusers import LDMSuperResolutionPipeline
from PIL import Image

pipe = LDMSuperResolutionPipeline.from_pretrained('duongna/ldm-super-resolution')
pipe.to('cuda')

img = Image.open('low_resolution.jpg')
super_img = pipe(img, num_inference_steps=100, eta=1)
super_img['images'][0]

[Four example image pairs, each shown as: input image -> output image]

cc @patrickvonplaten @patil-suraj @pcuenca

@HuggingFaceDocBuilderDev
HuggingFaceDocBuilderDev commented Nov 3, 2022

The documentation is not available anymore as the PR was closed or merged.

@duongna21 duongna21 changed the title Add Super Resolution pipeline Add LDM Super Resolution pipeline Nov 3, 2022
@patil-suraj patil-suraj self-assigned this Nov 3, 2022
Contributor

@patil-suraj patil-suraj left a comment

Great PR @duongna21 and super cool addition! Tried the pipeline and it works super well already.

I left some comments; we need to address a few things before we can merge this, mainly:

  • handling dtype and device
  • handling different schedulers
  • add doc page for the pipeline
  • add tests for the pipeline in tests/pipelines/ldm_superresolition

Let me know if you need help with any of this. Great work!
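
For the "handling different schedulers" item, one common approach is to inspect the scheduler's `step()` signature and pass `eta` only when it is accepted. The sketch below is a minimal illustration under assumptions: the two scheduler classes and the `build_step_kwargs` helper are hypothetical stand-ins, not the code from this PR or from the library.

```python
import inspect


class DDIMLikeScheduler:
    """Hypothetical scheduler whose step() accepts an eta argument."""

    def step(self, model_output, timestep, sample, eta=0.0):
        return sample - 0.1 * model_output


class PNDMLikeScheduler:
    """Hypothetical scheduler whose step() has no eta parameter."""

    def step(self, model_output, timestep, sample):
        return sample - 0.1 * model_output


def build_step_kwargs(scheduler, eta):
    """Return only the extra keyword arguments that scheduler.step() accepts."""
    accepted = inspect.signature(scheduler.step).parameters
    extra = {}
    if "eta" in accepted:
        extra["eta"] = eta
    return extra


# eta is forwarded to the first scheduler and silently dropped for the second.
ddim_kwargs = build_step_kwargs(DDIMLikeScheduler(), eta=1.0)
pndm_kwargs = build_step_kwargs(PNDMLikeScheduler(), eta=1.0)
```

With this pattern, a call such as `scheduler.step(noise_pred, t, latents, **extra_kwargs)` would work for both kinds of scheduler without a branch per scheduler class.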

Contributor

@patrickvonplaten patrickvonplaten left a comment

Super cool addition - think we only need to solve some nits and then this is good to go :-)

@duongna21
Contributor Author

duongna21 commented Nov 6, 2022

@patil-suraj @patrickvonplaten Thank you very much for the detailed comments. I learned a lot about the library while trying to address them. Please check out the fixes.
Also, can you help me fix the test at tests/pipelines/latent_diffusion/test_latent_diffusion_superresolution.py?

@patil-suraj
Contributor

Hey @duongna21 super cool! This PR also fixes typos in other pipelines. It would be best to open a separate PR for those and keep this PR only for the super-resolution pipeline. It's better to have a single-purpose PR, so it's easy to test and review. Hope you understand :)

@duongna21
Contributor Author

It would be best to open a separate PR for this and keep this PR only for super-resolution pipeline.

@patil-suraj Sure, we should indeed do that. I've reverted the typo fixes in this PR.


def preprocess(image):
    w, h = image.size
    w, h = map(lambda x: x - x % 32, (w, h))  # resize to integer multiple of 32
Member

Why is this necessary? An alternative would be to pad and then crop the upscaled image. Not sure if it's worth it, slightly worried that this might skew images a little bit.
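
To make the two options concrete, the size arithmetic can be sketched as follows. This is only an illustration under assumptions: the helper names are hypothetical, padding is assumed to be added on the right and bottom edges, and the 4x factor is taken from the PR description.

```python
def round_down(size, multiple=32):
    """Current approach: shrink each side to the nearest lower multiple."""
    w, h = size
    return (w - w % multiple, h - h % multiple)


def pad_up(size, multiple=32):
    """Alternative: grow each side to the nearest higher multiple by padding."""
    w, h = size
    # -(-x // m) is ceiling division for positive integers.
    return (-(-w // multiple) * multiple, -(-h // multiple) * multiple)


def crop_box_after_upscale(original_size, scale=4):
    """Box (left, upper, right, lower) that removes the upscaled padding,
    assuming the padding was added on the right and bottom edges."""
    w, h = original_size
    return (0, 0, w * scale, h * scale)


original = (500, 333)
resized = round_down(original)                # (480, 320)
padded = pad_up(original)                     # (512, 352)
crop_box = crop_box_after_upscale(original)   # (0, 0, 2000, 1332)
```

For a 500x333 input, rounding down and resizing changes the aspect ratio from about 1.5015 to exactly 1.5, which is the small skew mentioned above. Padding to 512x352 and cropping the upscaled result to 2000x1332 would keep the original proportions.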

Contributor Author

@duongna21 duongna21 Nov 8, 2022

@pcuenca This is how other pipelines resize the image so that it can pass through the UNet successfully (I agree that it might skew the image). I'm really sorry, but I can't fully understand your suggestion. Could you kindly push a commit for it?

Contributor

Here the preprocessing should be similar to how it's done in the original repo, since the model is trained on the preprocessed image. @duongna21 could you post a link to the original inference code?

Contributor Author

@duongna21 duongna21 Nov 8, 2022

@patil-suraj Look at this and this. It works great with varying img size. But I can't spend time on this in the next few days.

Contributor

Thanks and no worries. We'll try to take a look at this; we can also merge the PR without it.

@duongna21
Contributor Author

@pcuenca Thanks a lot for the helpful suggestions. The tests look good now.

Contributor

@patil-suraj patil-suraj left a comment

The PR is looking good! Thank you for addressing the comments :)

Will run the slow tests and upload the checkpoint under the CompVis org on the Hub.

One last thing to verify is whether the preprocessing code matches how it's done in the original repo.

Then this should be good to merge :)


@patil-suraj
Contributor

Thanks a lot @duongna21! Uploaded the checkpoint under the official account.

https://huggingface.co/CompVis/ldm-super-resolution-4x-openimages

@patil-suraj patil-suraj merged commit 5a59f9b into huggingface:main Nov 9, 2022
# See the License for the specific language governing permissions and
# limitations under the License.

import random
Contributor

nice tests!

yoonseokjin pushed a commit to yoonseokjin/diffusers that referenced this pull request Dec 25, 2023
* Add ldm super resolution pipeline

* style

* fix copies

* style

* fix doc

* Update src/diffusers/pipelines/latent_diffusion/pipeline_latent_diffusion_superresolution.py

Co-authored-by: Patrick von Platen <[email protected]>

* Update src/diffusers/pipelines/latent_diffusion/pipeline_latent_diffusion_superresolution.py

Co-authored-by: Patrick von Platen <[email protected]>

* Update src/diffusers/pipelines/latent_diffusion/pipeline_latent_diffusion_superresolution.py

Co-authored-by: Suraj Patil <[email protected]>

* Update src/diffusers/pipelines/latent_diffusion/pipeline_latent_diffusion_superresolution.py

Co-authored-by: Suraj Patil <[email protected]>

* add doc

* address comments

* address comments

* fix doc

* minor

* add tests

* add tests

* load text encoder from subfolder

* fix test

* fix test

* style

* style

* handle mps latents

* unfix typo

* unfix typo

* Update tests/pipelines/latent_diffusion/test_latent_diffusion_superresolution.py

Co-authored-by: Pedro Cuenca <[email protected]>

* fix set_timesteps mps

* fix set_timesteps mps

* Update src/diffusers/pipelines/latent_diffusion/pipeline_latent_diffusion_superresolution.py

Co-authored-by: Suraj Patil <[email protected]>

* Update src/diffusers/pipelines/latent_diffusion/pipeline_latent_diffusion_superresolution.py

Co-authored-by: Suraj Patil <[email protected]>

* Update src/diffusers/pipelines/latent_diffusion/pipeline_latent_diffusion_superresolution.py

Co-authored-by: Suraj Patil <[email protected]>

* Update src/diffusers/pipelines/latent_diffusion/pipeline_latent_diffusion_superresolution.py

Co-authored-by: Suraj Patil <[email protected]>

* style

* test 64x64 instead of 256x256

Co-authored-by: Patrick von Platen <[email protected]>
Co-authored-by: Suraj Patil <[email protected]>
Co-authored-by: Pedro Cuenca <[email protected]>
Development

Successfully merging this pull request may close these issues.

Super Resolution Diffusion Model
Wishlist: example for training a superresolution pipeline
