Commit bc9a8ce

patrickvonplaten, pcuenca, and sayakpaul authored
[SD-XL] Add new pipelines (#3859)
* Add new text encoder
* add transformers depth
* More
* Correct conversion script
* Fix more
* Fix more
* Correct more
* correct text encoder
* Finish all
* proof that in works in run local xl
* clean up
* Get refiner to work
* Add red castle
* Fix batch size
* Improve pipelines more
* Finish text2image tests
* Add img2img test
* Fix more
* fix import
* Fix embeddings for classic models (#3888)
  Fix embeddings for classic SD models.
* Allow multiple prompts to be passed to the refiner (#3895)
* finish more
* Apply suggestions from code review
* add watermarker
* Model offload (#3889)
  * Model offload.
  * Model offload for refiner / img2img
  * Hardcode encoder offload on img2img vae encode
    Saves some GPU RAM in img2img / refiner tasks so it remains below 8 GB.
  ---------
  Co-authored-by: Patrick von Platen <[email protected]>
* correct
* fix
* clean print
* Update install warning for `invisible-watermark`
* add: missing docstrings.
* fix and simplify the usage example in img2img.
* fix setup for watermarking.
* Revert "fix setup for watermarking."
  This reverts commit 491bc9f.
* fix: watermarking setup.
* fix: op.
* run make fix-copies.
* make sure tests pass
* improve convert
* make tests pass
* make tests pass
* better error message
* fiinsh
* finish
* Fix final test
---------
Co-authored-by: Pedro Cuenca <[email protected]>
Co-authored-by: Sayak Paul <[email protected]>
1 parent b62d9a1 commit bc9a8ce

28 files changed: +2512 -61 lines changed

.github/workflows/build_documentation.yml

Lines changed: 14 additions & 7 deletions
@@ -9,13 +9,20 @@ on:
       - v*-patch

 jobs:
-  build:
-    uses: huggingface/doc-builder/.github/workflows/build_main_documentation.yml@main
-    with:
-      commit_sha: ${{ github.sha }}
-      package: diffusers
-      notebook_folder: diffusers_doc
-      languages: en ko zh
+  build:
+    steps:
+      - name: Install dependencies
+        run: |
+          apt-get update && apt-get install libsndfile1-dev libgl1 -y
+
+      - name: Build doc
+        uses: huggingface/doc-builder/.github/workflows/build_main_documentation.yml@main
+        with:
+          commit_sha: ${{ github.sha }}
+          package: diffusers
+          notebook_folder: diffusers_doc
+          languages: en ko zh
+
     secrets:
       token: ${{ secrets.HUGGINGFACE_PUSH }}
       hf_token: ${{ secrets.HF_DOC_BUILD_PUSH }}

.github/workflows/build_pr_documentation.yml

Lines changed: 12 additions & 6 deletions
@@ -9,9 +9,15 @@ concurrency:

 jobs:
   build:
-    uses: huggingface/doc-builder/.github/workflows/build_pr_documentation.yml@main
-    with:
-      commit_sha: ${{ github.event.pull_request.head.sha }}
-      pr_number: ${{ github.event.number }}
-      package: diffusers
-      languages: en ko
+    steps:
+      - name: Install dependencies
+        run: |
+          apt-get update && apt-get install libsndfile1-dev libgl1 -y
+
+      - name: Build doc
+        uses: huggingface/doc-builder/.github/workflows/build_pr_documentation.yml@main
+        with:
+          commit_sha: ${{ github.event.pull_request.head.sha }}
+          pr_number: ${{ github.event.number }}
+          package: diffusers
+          languages: en ko zh

.github/workflows/pr_tests.yml

Lines changed: 1 addition & 1 deletion
@@ -62,7 +62,7 @@ jobs:

     - name: Install dependencies
       run: |
-        apt-get update && apt-get install libsndfile1-dev -y
+        apt-get update && apt-get install libsndfile1-dev libgl1 -y
         python -m pip install -e .[quality,test]

     - name: Environment

docker/diffusers-pytorch-cpu/Dockerfile

Lines changed: 3 additions & 1 deletion
@@ -14,6 +14,7 @@ RUN apt update && \
     libsndfile1-dev \
     python3.8 \
     python3-pip \
+    libgl1 \
     python3.8-venv && \
     rm -rf /var/lib/apt/lists

@@ -27,6 +28,7 @@ RUN python3 -m pip install --no-cache-dir --upgrade pip && \
         torch \
         torchvision \
         torchaudio \
+        invisible_watermark \
         --extra-index-url https://download.pytorch.org/whl/cpu && \
     python3 -m pip install --no-cache-dir \
         accelerate \
@@ -40,4 +42,4 @@ RUN python3 -m pip install --no-cache-dir --upgrade pip && \
         tensorboard \
         transformers

-CMD ["/bin/bash"]
+CMD ["/bin/bash"]

docker/diffusers-pytorch-cuda/Dockerfile

Lines changed: 3 additions & 1 deletion
@@ -12,6 +12,7 @@ RUN apt update && \
     curl \
     ca-certificates \
     libsndfile1-dev \
+    libgl1 \
     python3.8 \
     python3-pip \
     python3.8-venv && \
@@ -26,7 +27,8 @@ RUN python3 -m pip install --no-cache-dir --upgrade pip && \
     python3 -m pip install --no-cache-dir \
         torch \
         torchvision \
-        torchaudio && \
+        torchaudio \
+        invisible_watermark && \
     python3 -m pip install --no-cache-dir \
         accelerate \
         datasets \
Lines changed: 42 additions & 0 deletions
@@ -0,0 +1,42 @@
+<!--Copyright 2023 The HuggingFace Team. All rights reserved.
+
+Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
+the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
+an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
+specific language governing permissions and limitations under the License.
+-->
+
+# Stable diffusion XL
+
+Stable Diffusion 2 is a text-to-image _latent diffusion_ model built upon the work of [Stable Diffusion 1](https://stability.ai/blog/stable-diffusion-public-release).
+The project to train Stable Diffusion 2 was led by Robin Rombach and Katherine Crowson from [Stability AI](https://stability.ai/) and [LAION](https://laion.ai/).
+
+*The Stable Diffusion 2.0 release includes robust text-to-image models trained using a brand new text encoder (OpenCLIP), developed by LAION with support from Stability AI, which greatly improves the quality of the generated images compared to earlier V1 releases. The text-to-image models in this release can generate images with default resolutions of both 512x512 pixels and 768x768 pixels.
+These models are trained on an aesthetic subset of the [LAION-5B dataset](https://laion.ai/blog/laion-5b/) created by the DeepFloyd team at Stability AI, which is then further filtered to remove adult content using [LAION’s NSFW filter](https://openreview.net/forum?id=M3Y74vmsMcY).*
+
+For more details about how Stable Diffusion 2 works and how it differs from Stable Diffusion 1, please refer to the official [launch announcement post](https://stability.ai/blog/stable-diffusion-v2-release).
+
+## Tips
+
+### Available checkpoints:
+
+- *Text-to-Image (1024x1024 resolution)*: [stabilityai/stable-diffusion-xl-base-0.9](https://huggingface.co/stabilityai/stable-diffusion-xl-base-0.9) with [`StableDiffusionXLPipeline`]
+- *Image-to-Image / Refiner (1024x1024 resolution)*: [stabilityai/stable-diffusion-xl-refiner-0.9](https://huggingface.co/stabilityai/stable-diffusion-xl-refiner-0.9) with [`StableDiffusionXLImg2ImgPipeline`]
+
+TODO
+
+## StableDiffusionXLPipeline
+
+[[autodoc]] StableDiffusionXLPipeline
+	- all
+	- __call__
+
+## StableDiffusionXLImg2ImgPipeline
+
+[[autodoc]] StableDiffusionXLImg2ImgPipeline
+	- all
+	- __call__
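
Note on the checkpoints listed in the new doc page: the base and refiner pipelines can be chained, with the base model producing an image and the refiner running image-to-image on it. The following is a minimal sketch, not taken from the commit, assuming a CUDA GPU, access to both checkpoints on the Hub, and a diffusers install that includes these pipelines. The helper name `generate_and_refine` and the half-precision setting are illustrative assumptions.

```python
BASE_ID = "stabilityai/stable-diffusion-xl-base-0.9"
REFINER_ID = "stabilityai/stable-diffusion-xl-refiner-0.9"


def generate_and_refine(prompt: str):
    """Hypothetical helper: text-to-image with the base model, then img2img with the refiner."""
    # Imported lazily so that defining this helper does not require torch or diffusers.
    import torch
    from diffusers import StableDiffusionXLImg2ImgPipeline, StableDiffusionXLPipeline

    # Assumption: half precision on a CUDA device to keep memory use down.
    base = StableDiffusionXLPipeline.from_pretrained(BASE_ID, torch_dtype=torch.float16)
    base.to("cuda")
    image = base(prompt=prompt).images[0]

    refiner = StableDiffusionXLImg2ImgPipeline.from_pretrained(REFINER_ID, torch_dtype=torch.float16)
    refiner.to("cuda")
    # The refiner takes the same prompt plus the base model's output image.
    return refiner(prompt=prompt, image=image).images[0]
```

Calling `generate_and_refine("a red castle on a hill")` would download both checkpoints on first use, so it is best tried on a machine with enough disk space and GPU memory.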

scripts/convert_original_stable_diffusion_to_diffusers.py

Lines changed: 8 additions & 0 deletions
@@ -126,6 +126,13 @@
         "--controlnet", action="store_true", default=None, help="Set flag if this is a controlnet checkpoint."
     )
     parser.add_argument("--half", action="store_true", help="Save weights in half precision.")
+    parser.add_argument(
+        "--vae_path",
+        type=str,
+        default=None,
+        required=False,
+        help="Set to a path, hub id to an already converted vae to not convert it again.",
+    )
     args = parser.parse_args()

     pipe = download_from_original_stable_diffusion_ckpt(
@@ -144,6 +151,7 @@
         stable_unclip_prior=args.stable_unclip_prior,
         clip_stats_path=args.clip_stats_path,
         controlnet=args.controlnet,
+        vae_path=args.vae_path,
     )

     if args.half:
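
Note on the `--vae_path` flag added above: it is an optional string argument that defaults to `None`, so existing invocations of the script keep converting the VAE as before. The sketch below reproduces only the two argument definitions visible in this hunk in a standalone parser; `build_parser` and the example path are hypothetical names used for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Standalone parser holding only the two flags shown in the hunk."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--half", action="store_true", help="Save weights in half precision.")
    parser.add_argument(
        "--vae_path",
        type=str,
        default=None,
        required=False,
        help="Set to a path, hub id to an already converted vae to not convert it again.",
    )
    return parser


# Without the flag, vae_path stays None and the caller can fall back to converting the VAE.
default_args = build_parser().parse_args([])

# With the flag, the given path or hub id is passed through unchanged.
custom_args = build_parser().parse_args(["--vae_path", "path/to/converted-vae"])
```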

setup.py

Lines changed: 2 additions & 0 deletions
@@ -89,6 +89,7 @@
     "huggingface-hub>=0.13.2",
     "requests-mock==1.10.0",
     "importlib_metadata",
+    "invisible-watermark",
     "isort>=5.5.4",
     "jax>=0.2.8,!=0.3.2",
     "jaxlib>=0.1.65",
@@ -193,6 +194,7 @@ def run(self):
     "compel",
     "datasets",
     "Jinja2",
+    "invisible-watermark",
     "k-diffusion",
     "librosa",
     "omegaconf",

src/diffusers/__init__.py

Lines changed: 9 additions & 0 deletions
@@ -5,6 +5,7 @@
     OptionalDependencyNotAvailable,
     is_flax_available,
     is_inflect_available,
+    is_invisible_watermark_available,
     is_k_diffusion_available,
     is_k_diffusion_version,
     is_librosa_available,
@@ -179,6 +180,14 @@
         VQDiffusionPipeline,
     )

+try:
+    if not (is_torch_available() and is_transformers_available() and is_invisible_watermark_available()):
+        raise OptionalDependencyNotAvailable()
+except OptionalDependencyNotAvailable:
+    from .utils.dummy_torch_and_transformers_and_invisible_watermark_objects import *  # noqa F403
+else:
+    from .pipelines import StableDiffusionXLImg2ImgPipeline, StableDiffusionXLPipeline
+
 try:
     if not (is_torch_available() and is_transformers_available() and is_k_diffusion_available()):
         raise OptionalDependencyNotAvailable()
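
Note on the import guard added above: the try/except/else block exports the real pipelines only when every optional dependency is present, and falls back to dummy objects otherwise. The sketch below imitates that control flow with the standard library only. The exception class here is a stand-in for the one in the diff, `is_available` and `select_exports` are hypothetical names, and probing with `importlib.util.find_spec` is an assumption about how such availability checks can be written, not a copy of the library's helpers.

```python
import importlib.util


class OptionalDependencyNotAvailable(Exception):
    """Stand-in for the exception of the same name used in the diff."""


def is_available(module_name: str) -> bool:
    # A top-level module that is not installed yields None from find_spec.
    return importlib.util.find_spec(module_name) is not None


def select_exports(required_modules) -> str:
    """Return which set of objects would be exported: "real" or "dummy"."""
    try:
        if not all(is_available(name) for name in required_modules):
            raise OptionalDependencyNotAvailable()
    except OptionalDependencyNotAvailable:
        # Mirrors the branch that imports the dummy placeholder objects.
        return "dummy"
    else:
        # Mirrors the branch that imports the real pipelines.
        return "real"
```

For example, `select_exports(["json"])` takes the `else` branch because `json` ships with Python, while adding a module name that is not installed sends it down the dummy branch.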

src/diffusers/dependency_versions_table.py

Lines changed: 1 addition & 0 deletions
@@ -13,6 +13,7 @@
     "huggingface-hub": "huggingface-hub>=0.13.2",
     "requests-mock": "requests-mock==1.10.0",
     "importlib_metadata": "importlib_metadata",
+    "invisible-watermark": "invisible-watermark",
     "isort": "isort>=5.5.4",
     "jax": "jax>=0.2.8,!=0.3.2",
     "jaxlib": "jaxlib>=0.1.65",
