Commit 7996a30
add auto-creation of mask for inpainting (CompVis#438)
* now use a single init image for both image and mask
* turn on debugging for now to write out mask and image
* add back -M option as a fallback
1 parent a69ca31 commit 7996a30

File tree

5 files changed: +191 −96 lines

README.md

Lines changed: 106 additions & 21 deletions
````diff
@@ -22,22 +22,24 @@ text-to-image generator. This fork supports:
    generating images in your browser.
 
 3. Support for img2img in which you provide a seed image to guide the
-   image creation. (inpainting & masking coming soon)
+   image creation
 
-4. A notebook for running the code on Google Colab.
+4. Preliminary inpainting support.
 
-5. Upscaling and face fixing using the optional ESRGAN and GFPGAN
+5. A notebook for running the code on Google Colab.
+
+6. Upscaling and face fixing using the optional ESRGAN and GFPGAN
    packages.
 
-6. Weighted subprompts for prompt tuning.
+7. Weighted subprompts for prompt tuning.
 
-7. [Image variations](VARIATIONS.md) which allow you to systematically
+8. [Image variations](VARIATIONS.md) which allow you to systematically
    generate variations of an image you like and combine two or more
    images together to combine the best features of both.
 
-8. Textual inversion for customization of the prompt language and images.
+9. Textual inversion for customization of the prompt language and images.
 
-8. ...and more!
+10. ...and more!
 
 This fork is rapidly evolving, so use the Issues panel to report bugs
 and make feature requests, and check back periodically for
@@ -75,9 +77,10 @@ log file of image names and prompts to the selected output directory.
 In addition, as of version 1.02, it also writes the prompt into the PNG
 file's metadata where it can be retrieved using scripts/images2prompt.py
 
-The script is confirmed to work on Linux and Windows systems. It should
-work on MacOSX as well, but this is not confirmed. Note that this script
-runs from the command-line (CMD or Terminal window), and does not have a GUI.
+The script is confirmed to work on Linux, Windows and Mac
+systems. Note that this script runs from the command-line or can be used
+as a Web application. The Web GUI is currently rudimentary, but a much
+better replacement is on its way.
 
 ```
 (ldm) ~/stable-diffusion$ python3 ./scripts/dream.py
@@ -97,7 +100,7 @@ dream> "there's a fly in my soup" -n6 -g
 dream> q
 
 # this shows how to retrieve the prompt stored in the saved image's metadata
-(ldm) ~/stable-diffusion$ python3 ./scripts/images2prompt.py outputs/img_samples/*.png
+(ldm) ~/stable-diffusion$ python ./scripts/images2prompt.py outputs/img_samples/*.png
 00009.png: "ashley judd riding a camel" -s150 -S 416354203
 00010.png: "ashley judd riding a camel" -s150 -S 1362479620
 00011.png: "there's a fly in my soup" -n6 -g -S 2685670268
@@ -118,29 +121,68 @@ The script itself also recognizes a series of command-line switches
 that will change important global defaults, such as the directory for
 image outputs and the location of the model weight files.
 
+## Hardware Requirements
+
+You will need one of:
+
+1. An NVIDIA-based graphics card with 8 GB or more of VRAM memory*.
+
+2. An Apple computer with an M1 chip.**
+
+3. At least 12 GB of main memory RAM.
+
+4. At least 6 GB of free disk space for the machine learning model,
+python, and all its dependencies.
+
+* If you are have a Nvidia 10xx series card (e.g. the 1080ti), please
+run the dream script in full-precision mode as shown below.
+
+** Similarly, specify full-precision mode on Apple M1 hardware.
+
+To run in full-precision mode, start dream.py with the
+--full_precision flag:
+
+~~~~
+(ldm) ~/stable-diffusion$ python scripts/dream.py --full_precision
+~~~~
+
 ## Image-to-Image
 
 This script also provides an img2img feature that lets you seed your
-creations with a drawing or photo. This is a really cool feature that tells
-stable diffusion to build the prompt on top of the image you provide, preserving
-the original's basic shape and layout. To use it, provide the --init_img
-option as shown here:
+creations with an initial drawing or photo. This is a really cool
+feature that tells stable diffusion to build the prompt on top of the
+image you provide, preserving the original's basic shape and
+layout. To use it, provide the --init_img option as shown here:
 
 ```
 dream> "waterfall and rainbow" --init_img=./init-images/crude_drawing.png --strength=0.5 -s100 -n4
 ```
 
-The --init_img (-I) option gives the path to the seed picture. --strength (-f) controls how much
-the original will be modified, ranging from 0.0 (keep the original intact), to 1.0 (ignore the original
-completely). The default is 0.75, and ranges from 0.25-0.75 give interesting results.
+The --init_img (-I) option gives the path to the seed
+picture. --strength (-f) controls how much the original will be
+modified, ranging from 0.0 (keep the original intact), to 1.0 (ignore
+the original completely). The default is 0.75, and ranges from
+0.25-0.75 give interesting results.
 
-You may also pass a -v<count> option to generate count variants on the original image. This is done by
-passing the first generated image back into img2img the requested number of times. It generates interesting
+You may also pass a -v<count> option to generate count variants on the
+original image. This is done by passing the first generated image back
+into img2img the requested number of times. It generates interesting
 variants.
 
+If the initial image contains transparent regions, then Stable
+Diffusion will only draw within the transparent regions, a process
+called "inpainting". However, for this to work correctly, the color
+information underneath the transparent needs to be preserved, not
+erased. See [Creating Transparent Images for
+Inpainting](#creating-transparent-images-for-inpainting) for details.
+
 ## Seamless Tiling
 
-The seamless tiling mode causes generated images to seamlessly tile with itself. To use it, add the --seamless option when starting the script which will result in all generated images to tile, or for each dream> prompt as shown here:
+The seamless tiling mode causes generated images to seamlessly tile
+with itself. To use it, add the --seamless option when starting the
+script which will result in all generated images to tile, or for each
+dream> prompt as shown here:
+
 ```
 dream> "pond garden with lotus by claude monet" --seamless -s100 -n4
 ```
@@ -774,6 +816,49 @@ of branch>
 You will need to go through the install procedure again, but it should
 be fast because all the dependencies are already loaded.
 
+# Creating Transparent Regions for Inpainting
+
+Inpainting is really cool. To do it, you start with an initial image
+and use a photoeditor to make one or more regions transparent
+(i.e. they have a "hole" in them). You then provide the path to this
+image at the dream> command line using the -I switch. Stable Diffusion
+will only paint within the transparent region.
+
+There's a catch. In the current implementation, you have to prepare
+the initial image correctly so that the underlying colors are
+preserved under the transparent area. Many imaging editing
+applications will by default erase the color information under the
+transparent pixels and replace them with white or black, which will
+lead to suboptimal inpainting. You also must take care to export the
+PNG file in such a way that the color information is preserved.
+
+If your photoeditor is erasing the underlying color information,
+dream.py will give you a big fat warning. If you can't find a way to
+coax your photoeditor to retain color values under transparent areas,
+then you can combine the -I and -M switches to provide both the
+original unedited image and the masked (partially transparent) image:
+
+~~~~
+dream> man with cat on shoulder -I./images/man.png -M./images/man-transparent.png
+~~~~
+
+We are hoping to get rid of the need for this workaround in an
+upcoming release.
+
+## Recipe for GIMP
+
+GIMP is a popular Linux photoediting tool.
+
+1. Open image in GIMP.
+2. Layer->Transparency->Add Alpha Channel
+2. Use lasoo tool to select region to mask
+3. Choose Select -> Float to create a floating selection
+4. Open the Layers toolbar (^L) and select "Floating Selection"
+5. Set opacity to 0%
+6. Export as PNG
+7. In the export dialogue, Make sure the "Save colour values from
+transparent pixels" checkbox is selected.
+
 # Contributing
 
 Anyone who wishes to contribute to this project, whether
````
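The commit message says a single init image now supplies both the image and the mask, with the mask auto-created from the image's transparent region. A minimal sketch of that idea, using NumPy in place of the repo's actual PIL/torch pipeline (`mask_from_alpha` is an illustrative name, not a function in this codebase):

```python
import numpy as np

def mask_from_alpha(rgba):
    """Derive a keep-mask from an RGBA image's alpha channel.

    rgba: (H, W, 4) uint8 array. Returns a float array where 1.0 marks
    opaque pixels (kept as-is) and 0.0 marks transparent pixels, i.e.
    the "hole" that Stable Diffusion is allowed to repaint.
    """
    alpha = rgba[..., 3].astype(np.float32) / 255.0
    return (alpha > 0.5).astype(np.float32)

# A 4x4 all-white image whose top-left 2x2 corner was erased to transparent.
img = np.full((4, 4, 4), 255, dtype=np.uint8)
img[:2, :2, 3] = 0  # alpha = 0 inside the hole

mask = mask_from_alpha(img)  # zeros in the 2x2 hole, ones elsewhere
```

Note that only the alpha channel is consulted here; that is exactly why the RGB values underneath the transparent pixels must survive export, as the README section above stresses.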

ldm/dream/generator/inpaint.py

Lines changed: 4 additions & 4 deletions
````diff
@@ -16,16 +16,16 @@ def __init__(self,model):
 
     @torch.no_grad()
     def get_make_image(self,prompt,sampler,steps,cfg_scale,ddim_eta,
-                       conditioning,init_image,init_mask,strength,
+                       conditioning,init_image,mask_image,strength,
                        step_callback=None,**kwargs):
         """
         Returns a function returning an image derived from the prompt and
         the initial image + mask. Return value depends on the seed at
         the time you call it. kwargs are 'init_latent' and 'strength'
         """
 
-        init_mask = init_mask[0][0].unsqueeze(0).repeat(4,1,1).unsqueeze(0)
-        init_mask = repeat(init_mask, '1 ... -> b ...', b=1)
+        mask_image = mask_image[0][0].unsqueeze(0).repeat(4,1,1).unsqueeze(0)
+        mask_image = repeat(mask_image, '1 ... -> b ...', b=1)
 
         # PLMS sampler not supported yet, so ignore previous sampler
         if not isinstance(sampler,DDIMSampler):
@@ -66,7 +66,7 @@ def make_image(x_T):
             img_callback = step_callback,
             unconditional_guidance_scale = cfg_scale,
             unconditional_conditioning = uc,
-            mask = init_mask,
+            mask = mask_image,
             init_latent = self.init_latent
         )
         return self.sample_to_image(samples)
````
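The renamed lines above reshape a single-channel mask to match the 4-channel latent the DDIM sampler works in. A NumPy sketch of the same shape gymnastics (illustrative only; the real code uses torch's `unsqueeze`/`repeat` and einops' `repeat`):

```python
import numpy as np

H, W = 8, 8  # latent-space height/width, not pixel dimensions

# mask_image arrives shaped (1, 1, H, W): batch of one, single channel,
# with 0.0 marking the region to repaint and 1.0 the region to keep.
mask_image = np.ones((1, 1, H, W), dtype=np.float32)
mask_image[0, 0, 2:6, 2:6] = 0.0  # a square hole to inpaint

# mask_image[0][0] drops the batch and channel dims     -> (H, W)
# .unsqueeze(0).repeat(4,1,1) copies it per channel     -> (4, H, W)
# .unsqueeze(0) restores the batch dim                  -> (1, 4, H, W)
plane = mask_image[0, 0]
latent_mask = np.repeat(plane[None, :, :], 4, axis=0)[None, ...]
```

The point is simply that one H x W keep/repaint plane is broadcast across all four latent channels so it can gate `init_latent` during sampling.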

ldm/dream/pngwriter.py

Lines changed: 0 additions & 4 deletions
````diff
@@ -61,14 +61,10 @@ def normalize_prompt(self):
         switches.append(f'-A{opt.sampler_name or t2i.sampler_name}')
         # to do: put model name into the t2i object
         # switches.append(f'--model{t2i.model_name}')
-        if opt.invert_mask:
-            switches.append(f'--invert_mask')
         if opt.seamless or t2i.seamless:
             switches.append(f'--seamless')
         if opt.init_img:
             switches.append(f'-I{opt.init_img}')
-        if opt.mask:
-            switches.append(f'-M{opt.mask}')
         if opt.fit:
             switches.append(f'--fit')
         if opt.strength and opt.init_img is not None:
````
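For context, normalize_prompt rebuilds a command-line string from the parsed options so it can be stored in the PNG metadata; this commit drops the --invert_mask and -M switches because the mask is now derived from the init image itself. A simplified stand-alone sketch of the pattern (the `opt` object and the switch subset here are illustrative, not the repo's exact set):

```python
from types import SimpleNamespace

def reconstruct_switches(opt):
    """Rebuild a reproducibility switch string from parsed options."""
    switches = []
    if opt.seamless:
        switches.append('--seamless')
    if opt.init_img:
        switches.append(f'-I{opt.init_img}')
    if opt.strength and opt.init_img is not None:
        switches.append(f'-f{opt.strength}')
    return ' '.join(switches)

opt = SimpleNamespace(seamless=True, init_img='./init.png', strength=0.75)
line = reconstruct_switches(opt)  # '--seamless -I./init.png -f0.75'
```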

0 commit comments
