Commit 2ff27f7

Adding support for .bin files from huggingface concepts (invoke-ai#498)

* Adding support for .bin files from huggingface concepts
* Updating documentation to include huggingface .bin info

1 parent 9e53dae, commit 2ff27f7

2 files changed: +30 additions, -9 deletions

docs/features/TEXTUAL_INVERSION.md (8 additions, 4 deletions)

````diff
@@ -1,6 +1,8 @@
 # **Personalizing Text-to-Image Generation**
 
-You may personalize the generated images to provide your own styles or objects by training a new LDM checkpoint and introducing a new vocabulary to the fixed model.
+You may personalize the generated images to provide your own styles or objects by training a new LDM checkpoint and introducing a new vocabulary to the fixed model as a (.pt) embeddings file. Alternatively, you may use or train HuggingFace Concepts embeddings files (.bin) from https://huggingface.co/sd-concepts-library and its associated notebooks.
+
+**Training**
 
 To train, prepare a folder that contains images sized at 512x512 and execute the following:
 
@@ -26,9 +28,11 @@ On a RTX3090, the process for SD will take ~1h @1.6 iterations/sec.
 
 _Note_: According to the associated paper, the optimal number of images is 3-5. Your model may not converge if you use more images than that.
 
-Training will run indefinately, but you may wish to stop it before the heat death of the universe, when you find a low loss epoch or around ~5000 iterations.
+Training will run indefinitely, but you may wish to stop it before the heat death of the universe, when you find a low loss epoch or around ~5000 iterations.
+
+**Running**
 
-Once the model is trained, specify the trained .pt file when starting dream using
+Once the model is trained, specify the trained .pt or .bin file when starting dream using
 
 ```
 (ldm) ~/stable-diffusion$ python3 ./scripts/dream.py --embedding_path /path/to/embedding.pt --full_precision
@@ -46,7 +50,7 @@ This also works with image2image
 dream> "waterfall and rainbow in the style of *" --init_img=./init-images/crude_drawing.png --strength=0.5 -s100 -n4
 ```
 
-It's also possible to train multiple token (modify the placeholder string in `configs/stable-diffusion/v1-finetune.yaml`) and combine LDM checkpoints using:
+For .pt files it's also possible to train multiple tokens (modify the placeholder string in `configs/stable-diffusion/v1-finetune.yaml`) and combine LDM checkpoints using:
 
 ```
 (ldm) ~/stable-diffusion$ python3 ./scripts/merge_embeddings.py \
````
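The documentation change above distinguishes two embedding file layouts. As a rough illustration (toy dicts and hypothetical names, not the project's actual code): a classic textual-inversion `.pt` checkpoint carries the reserved keys `string_to_token` and `string_to_param`, while a HuggingFace Concepts `.bin` maps each concept's placeholder string directly to its embedding tensor.

```python
# Sketch with toy data: the two embedding checkpoint layouts this commit
# teaches the loader to tell apart. Plain lists stand in for tensors.

# A classic textual-inversion .pt checkpoint stores two reserved keys:
pt_style_ckpt = {
    'string_to_token': {'*': 265},          # placeholder -> CLIP token id
    'string_to_param': {'*': [0.1, 0.2]},   # placeholder -> embedding vector
}

# A HuggingFace Concepts .bin maps each concept placeholder string
# directly to its learned embedding:
bin_style_ckpt = {
    '<my-concept>': [0.3, 0.4],
}

def is_textual_inversion_pt(ckpt: dict) -> bool:
    """Mirror of the check the commit adds to load(): .pt files are
    identified by the presence of both reserved keys."""
    return 'string_to_token' in ckpt and 'string_to_param' in ckpt

print(is_textual_inversion_pt(pt_style_ckpt))   # True
print(is_textual_inversion_pt(bin_style_ckpt))  # False
```

Any checkpoint lacking the two reserved keys is treated as a Concepts-style file, so no explicit file-extension check is needed.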

ldm/modules/embedding_manager.py (22 additions, 5 deletions)

```diff
@@ -24,9 +24,9 @@ def get_clip_token_for_string(tokenizer, string):
         return_tensors='pt',
     )
     tokens = batch_encoding['input_ids']
-    assert (
+    """ assert (
         torch.count_nonzero(tokens - 49407) == 2
-    ), f"String '{string}' maps to more than a single token. Please use another string"
+    ), f"String '{string}' maps to more than a single token. Please use another string" """
 
     return tokens[0, 1]
 
@@ -57,8 +57,9 @@ def __init__(
     ):
         super().__init__()
 
-        self.string_to_token_dict = {}
+        self.embedder = embedder
 
+        self.string_to_token_dict = {}
         self.string_to_param_dict = nn.ParameterDict()
 
         self.initial_embeddings = (
@@ -217,12 +218,28 @@ def save(self, ckpt_path):
 
     def load(self, ckpt_path, full=True):
         ckpt = torch.load(ckpt_path, map_location='cpu')
-        self.string_to_token_dict = ckpt["string_to_token"]
-        self.string_to_param_dict = ckpt["string_to_param"]
+
+        # Handle .pt textual inversion files
+        if 'string_to_token' in ckpt and 'string_to_param' in ckpt:
+            self.string_to_token_dict = ckpt["string_to_token"]
+            self.string_to_param_dict = ckpt["string_to_param"]
+
+        # Handle .bin textual inversion files from Huggingface Concepts
+        # https://huggingface.co/sd-concepts-library
+        else:
+            for token_str in list(ckpt.keys()):
+                token = get_clip_token_for_string(self.embedder.tokenizer, token_str)
+                self.string_to_token_dict[token_str] = token
+                ckpt[token_str] = torch.nn.Parameter(ckpt[token_str])
+
+            self.string_to_param_dict.update(ckpt)
+
         if not full:
             for key, value in self.string_to_param_dict.items():
                 self.string_to_param_dict[key] = torch.nn.Parameter(value.half())
 
+        print(f'Added terms: {", ".join(self.string_to_param_dict.keys())}')
+
     def get_embedding_norms_squared(self):
         all_params = torch.cat(
             list(self.string_to_param_dict.values()), axis=0
```
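The new `else` branch in `load()` can be sketched in isolation. The following is a simplified, stand-alone version (hypothetical function names, a stubbed tokenizer in place of CLIP, and plain lists in place of tensors) of how each key in a Concepts `.bin` becomes a placeholder-to-token entry plus a parameter entry:

```python
# Simplified sketch of the .bin branch of EmbeddingManager.load().
# Stub: the real code calls get_clip_token_for_string(self.embedder.tokenizer,
# token_str) to obtain the single CLIP token id for the placeholder string.
def stub_clip_token_for_string(token_str: str) -> int:
    return abs(hash(token_str)) % 49407  # hypothetical stand-in, not CLIP

def load_concepts_bin(ckpt: dict):
    """Turn {'<concept>': embedding} into the manager's two dicts.
    The real code wraps each embedding in torch.nn.Parameter and updates
    an nn.ParameterDict; plain dicts keep this sketch dependency-free."""
    string_to_token_dict = {}
    string_to_param_dict = {}
    for token_str in list(ckpt.keys()):
        string_to_token_dict[token_str] = stub_clip_token_for_string(token_str)
        string_to_param_dict[token_str] = ckpt[token_str]
    return string_to_token_dict, string_to_param_dict

tokens, params = load_concepts_bin({'<my-concept>': [0.3, 0.4]})
print(f'Added terms: {", ".join(params.keys())}')  # prints: Added terms: <my-concept>
```

Note that this path relies on the `self.embedder = embedder` line added in `__init__`, since the loader must reach the CLIP tokenizer to map each placeholder string to a token id.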
