Conversation

@prathikr (Contributor) commented Dec 6, 2022

No description provided.

@HuggingFaceDocBuilderDev commented Dec 6, 2022

The documentation is not available anymore as the PR was closed or merged.

@prathikr prathikr changed the title integrate ort into textual-inversion and test-to-image examples integrate ort into textual-inversion and text-to-image examples Dec 12, 2022
@prathikr prathikr marked this pull request as ready for review December 21, 2022 21:38
@prathikr (Contributor, Author) commented:

@anton-l could I get a review on these scripts integrating ort with the remaining 2 stable diffusion tasks?

@patil-suraj (Contributor) left a comment

Looks good to me, thanks a lot for adding these examples! It would be nice to add a section in the README showing how to use this script, along with installation instructions.

Comment on lines +631 to +644
accelerator.wait_for_everyone()
if accelerator.is_main_process:
    unet = accelerator.unwrap_model(unet)
    if args.use_ema:
        ema_unet.copy_to(unet.parameters())

    pipeline = StableDiffusionPipeline.from_pretrained(
        args.pretrained_model_name_or_path,
        text_encoder=text_encoder,
        vae=vae,
        unet=unet,
        revision=args.revision,
    )
    pipeline.save_pretrained(args.output_dir)
Contributor commented:

Is it fine to save the unet directly given it's wrapped in ORTModule?
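For context, a minimal sketch (not part of this PR) of one way to keep the saved checkpoint free of the wrapper: keep a reference to the plain torch UNet before wrapping, assuming ORTModule leaves the wrapped module's parameters shared so that training updates are reflected in the original unet (worth verifying). Variable names such as ort_unet are illustrative, and args, text_encoder, and vae are reused from the training script:

from diffusers import StableDiffusionPipeline, UNet2DConditionModel
from onnxruntime.training.ortmodule import ORTModule

# Load the plain torch UNet and keep a reference to it.
unet = UNet2DConditionModel.from_pretrained(
    args.pretrained_model_name_or_path, subfolder="unet", revision=args.revision
)
# Run training through the ORT-wrapped module; the underlying parameters
# are assumed to be shared with the original unet.
ort_unet = ORTModule(unet)

# ... training loop runs forward/backward on ort_unet ...

# At save time, build the pipeline from the original (unwrapped) UNet so that
# save_pretrained serializes a regular UNet2DConditionModel, not the wrapper.
pipeline = StableDiffusionPipeline.from_pretrained(
    args.pretrained_model_name_or_path,
    text_encoder=text_encoder,
    vae=vae,
    unet=unet,
    revision=args.revision,
)
pipeline.save_pretrained(args.output_dir)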

    subfolder="unet",
    revision=args.revision,
)
unet = ORTModule(unet)
Contributor commented:

Maybe a dumb question: what model do we need to wrap during training? Does it need to be trainable, or can any module be used? Because in textual inversion the unet is not trained.

Contributor (Author) commented:

@patil-suraj interesting, I did not catch this. Is it the text_encoder that gets trained? Is that a BERT-based model or some other architecture?

Contributor commented:

Yes, in textual inversion only the text embedding layer is trained, not even the full text encoder. And it's a CLIPTextModel.
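A minimal sketch (not taken from this PR) of the freeze pattern the textual-inversion example follows, assuming a transformers CLIPTextModel: the unet and vae only run forward passes, and only the token embedding table of the text encoder receives gradients. The checkpoint name is illustrative:

import torch
from transformers import CLIPTextModel
from diffusers import AutoencoderKL, UNet2DConditionModel

model_name = "runwayml/stable-diffusion-v1-5"  # any Stable Diffusion checkpoint
text_encoder = CLIPTextModel.from_pretrained(model_name, subfolder="text_encoder")
vae = AutoencoderKL.from_pretrained(model_name, subfolder="vae")
unet = UNet2DConditionModel.from_pretrained(model_name, subfolder="unet")

# The unet and vae are frozen; they are only used in forward mode.
vae.requires_grad_(False)
unet.requires_grad_(False)

# Freeze everything in the text encoder except the token embedding matrix.
text_encoder.text_model.encoder.requires_grad_(False)
text_encoder.text_model.final_layer_norm.requires_grad_(False)
text_encoder.text_model.embeddings.position_embedding.requires_grad_(False)

# Only the token embedding vectors (including the newly added placeholder
# token in the full script) are optimized.
optimizer = torch.optim.AdamW(
    text_encoder.get_input_embeddings().parameters(), lr=5e-4
)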

@patil-suraj patil-suraj requested a review from anton-l December 22, 2022 13:42
@prathikr prathikr closed this Dec 22, 2022