[Reproduceability 1/3] Allow tensors to be generated on CPU #1902
Conversation
The documentation is not available anymore as the PR was closed or merged. |
```python
# <original>.time_embed -> <diffusers>.time_embedding
```
New black? 🤔
Yeah, I updated my black and then downgraded back to 22.8. We should probably blackify the complete codebase once soon :-)
Co-authored-by: Anton Lozhkov <[email protected]>
pcuenca left a comment:
Haven't tested it yet, but it looks great!
My only small concern is about the name; I wrote a comment below.
```diff
     latents = torch.randn(shape, generator=generator, device="cpu", dtype=dtype).to(device)
 else:
-    latents = torch.randn(shape, generator=generator, device=device, dtype=dtype)
+    latents = torch_randn(shape, generator=generator, device=device, dtype=dtype)
```
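For illustration, here is a minimal sketch of the idea behind such a helper. The name `randn_on_cpu` is hypothetical (the function introduced in this PR is `torch_randn`): the noise is sampled on the CPU, so the values depend only on the CPU generator state, and the tensor is then moved to the target device.

```python
import torch


def randn_on_cpu(shape, generator=None, device=None, dtype=None):
    # Hypothetical sketch: draw the noise on the CPU so the result depends
    # only on the (CPU) generator state, then move it to the target device.
    # This makes the sampled values identical across different GPUs.
    latents = torch.randn(shape, generator=generator, device="cpu", dtype=dtype)
    return latents.to(device) if device is not None else latents


# Two fresh CPU generators with the same seed yield identical noise:
g_a = torch.Generator("cpu").manual_seed(42)
g_b = torch.Generator("cpu").manual_seed(42)
assert torch.equal(randn_on_cpu((2, 3), generator=g_a), randn_on_cpu((2, 3), generator=g_b))
```

The trade-off, noted later in the diff, is a small extra cost for the CPU-to-device copy.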
Very nice
```python
latents = [
    torch.randn(shape, generator=generator[i], device=rand_device, dtype=dtype) for i in range(batch_size)
]
latents = torch.cat(latents, dim=0).to(device)
```
Very cool to include per-item reproducibility too
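The per-item scheme above can be illustrated with a short, self-contained sketch: using one CPU generator per batch item makes each sample's noise independent of the batch size. The seeds and shapes below are illustrative only:

```python
import torch

batch_size = 3
shape = (1, 4)  # per-sample latent shape (illustrative)

# One CPU generator per batch item, seeded 0, 1, 2 for this demo:
gens = [torch.Generator("cpu").manual_seed(i) for i in range(batch_size)]
latents = [torch.randn(shape, generator=gens[i]) for i in range(batch_size)]
batch = torch.cat(latents, dim=0)

# Generating sample 1 on its own reproduces row 1 of the batch exactly,
# regardless of how many other samples are in the batch:
g1 = torch.Generator("cpu").manual_seed(1)
single = torch.randn(shape, generator=g1)
assert torch.equal(batch[1:2], single)
```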
```python
logger = logging.get_logger(__name__)  # pylint: disable=invalid-name


def torch_randn(
```
My only concern here is that `torch_randn` is easy to confuse (both visually and inadvertently while typing) with `torch.randn`. Would it make sense to make the name slightly more different? Can't think of anything great though, `diffusers_randn` feels kind of ugly.
Maybe just `randn_tensor`?
Co-authored-by: Pedro Cuenca <[email protected]>
```python
        f" Tensors will be created on 'cpu' and then moved to {device}. Note that one can probably"
        f" slighly speed up this function by passing a generator that was created on the {device} device."
    )
elif generator.device.type != device.type and generator.device.type == "cuda":
```
I would add a comment here noting that we only allow cpu->cuda generation for reproducibility reasons, which is why the other direction is not supported / doesn't make sense.
Yes, I'll make a whole doc page about this in the follow-up PR :-)
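The asymmetry discussed above (cpu->cuda is allowed, cuda->cpu is not) can be sketched as a small validation helper. This is a hypothetical illustration, not the actual diffusers code; the name `check_generator_device` and its return values are invented for the demo:

```python
import torch


def check_generator_device(generator, device):
    # Hypothetical sketch of the mismatch rule discussed above:
    # a CPU generator targeting a CUDA device is allowed (sample on CPU,
    # then move the result) so results are reproducible across GPUs,
    # but a CUDA generator targeting a CPU tensor makes no sense.
    device = torch.device(device)
    if generator is None or generator.device.type == device.type:
        return "ok"
    if generator.device.type == "cpu":
        return "sample-on-cpu-then-move"  # allowed, for reproducibility
    raise ValueError(
        f"Cannot create a {device.type} tensor from a {generator.device.type} generator."
    )


# A CPU generator may target any device; this check runs without a GPU:
assert check_generator_device(torch.Generator("cpu"), "cuda") == "sample-on-cpu-then-move"
```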
…gingface#1902)

* [Deterministic torch randn] Allow tensors to be generated on CPU
* fix more
* up
* fix more
* up
* Update src/diffusers/utils/torch_utils.py
  Co-authored-by: Anton Lozhkov <[email protected]>
* Apply suggestions from code review
* up
* up
* Apply suggestions from code review
  Co-authored-by: Pedro Cuenca <[email protected]>

Co-authored-by: Anton Lozhkov <[email protected]>
Co-authored-by: Pedro Cuenca <[email protected]>
This PR adds a new helper function `torch_randn` that has three purposes:

This is the first PR of a two-PR series to see if this helps make UnCLIP deterministic across different CUDA GPUs. In a follow-up PR, I'd then like to replace all existing `torch.randn` calls with this function, as well as add a `reproducibility.md` guide.