- Check out and compile IREE with a release build, then add its tools to your path:
  `export PATH=/path/to/iree/build/release/tools:$PATH`
- Compile the full SDXL model:
  `./compile-txt2img.sh gfx942` (where `gfx942` is the target for MI300X)
- Run the benchmark:
  `./benchmark-txt2img.sh N /path/to/weights/irpa` (where `N` is the GPU index)
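
The steps above can be chained in a small wrapper. This is a minimal sketch, not part of the repository: the variable names (`IREE_TOOLS_DIR`, `TARGET`, `GPU_INDEX`, `IRPA_DIR`) and the `run_step` helper are hypothetical, GPU index `0` is only an example value, and the paths are the same placeholders used above. It assumes it is run from the directory that contains the two scripts.

```shell
#!/usr/bin/env bash
# Sketch only: the variable names and run_step are illustrative assumptions.
IREE_TOOLS_DIR="${IREE_TOOLS_DIR:-/path/to/iree/build/release/tools}"  # placeholder path
TARGET="${TARGET:-gfx942}"                    # gfx942 is the target for MI300X
GPU_INDEX="${GPU_INDEX:-0}"                   # example value for N
IRPA_DIR="${IRPA_DIR:-/path/to/weights/irpa}" # placeholder path

# Put the IREE release tools first on PATH.
export PATH="$IREE_TOOLS_DIR:$PATH"

# Run a script with its arguments, or report that it is missing.
run_step() {
  local script="$1"
  shift
  if [ -x "$script" ]; then
    "$script" "$@"
  else
    echo "missing: $script (run this from the directory containing the scripts)" >&2
    return 1
  fi
}

# Benchmark only if compilation succeeded.
run_step ./compile-txt2img.sh "$TARGET" \
  && run_step ./benchmark-txt2img.sh "$GPU_INDEX" "$IRPA_DIR" \
  || echo "pipeline did not complete" >&2
```

Overriding a variable on the command line, for example `GPU_INDEX=1` before the script name, selects a different GPU without editing the file.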
Caution
IRs in the following table might be stale. Use the ones in the `base_ir/` directory instead.
Note
SDXL-turbo differs from SDXL only in its usage and its training/weights. The model architecture (and therefore the weights-stripped MLIR) is identical.
| Variant | Submodel | MLIR (No Weights) (Config A) | safetensors | Splat IRPA | MLIR (No Weights) (Config B) |
|---|---|---|---|---|---|
| SDXL1.0 1024x1024 (f16, BS1, len64) | | | | | |
| | UNet + attn | Torch - Linalg | - | - | Azure |
| | UNet + PNDMScheduler | Azure | | | |
| | Clip1 | Azure | - | - | |
| | Clip2 | Azure | - | - | |
| | VAE decode + attn | Azure | - | = | Azure |
| | VAE encode + attn | [GCloud][sdxl-1-1024x1024-f16-stripped-weight-vae-encode] | Same as decode | - | - |
| SDXL1.0 1024x1024 (f32, BS1, len64) | | | | | |
| | UNet + attn | Azure | Azure | Azure | Azure |
| | Clip1 | Azure | Azure | Azure | - |
| | Clip2 | Azure | Azure | Azure | - |
| | VAE decode + attn | Azure | Azure | Azure | Azure |
| SDXL compiled pipeline IRPAs (f16) | | | | | |
| | UNet | scheduled_unet_f16.irpa | | | |
| | Prompt Encoder (CLIP1 + CLIP2) | prompt_encoder_f16.irpa | | | |
| | VAE | vae_decode_f16.irpa | | | |