A cross-platform desktop application providing a user-friendly GUI for Alibaba's Qwen-Image diffusion model. Built with a Python/Gradio backend and designed for future Tauri desktop distribution.
- 🎨 Easy Image Generation: Simple web interface for text-to-image generation
- 🧠 Smart Memory Management: CPU offloading for large models with GPU acceleration
- ⚡ Auto-Configuration: Automatic GPU/CPU detection and optimization
- 📦 One-Click Setup: Automatic model download and caching
- 🔧 Advanced Controls: Multiple aspect ratios, CFG scale, steps, and seed controls
- 📊 Real-Time Monitoring: Comprehensive logging and progress tracking
- 🌐 Cross-Platform: Works on Windows, macOS, and Linux
- 🚀 True Async Generation: A ThreadPoolExecutor keeps UI interactions from interrupting generation (see the sketch below)
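A minimal sketch of that async pattern, assuming a hypothetical `generate_image` worker function (the actual handler names in `qwen_gui.py` may differ):

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Optional

# Single worker: one generation at a time, independent of Gradio's event threads
executor = ThreadPoolExecutor(max_workers=1)
current_job: Optional[Future] = None

def start_generation(prompt: str) -> str:
    """Submit a generation job without blocking other UI callbacks."""
    global current_job
    current_job = executor.submit(generate_image, prompt)  # generate_image: hypothetical worker
    return "Generation started..."

def poll_result():
    """Polled from the UI; returns the image once the worker finishes."""
    if current_job is not None and current_job.done():
        return current_job.result()
    return None
```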
The Qwen-Image model produces high-quality images, and the application exposes its full range of options:
- High Resolutions: Up to 2048×2048 (1:1) and 2048×1152 (16:9) maximum quality
- Intuitive UI: Separate aspect ratio and resolution selection with live preview
- Multiple Aspect Ratios: 1:1, 16:9, 9:16, 4:3, 3:4, 3:2, 2:3 with four quality levels each (see the preset sketch below)
- Advanced Controls: CFG scale, negative prompts, custom seeds
- Professional Quality: 4K, ultra HD, cinematic composition enhancement
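One plausible way to represent those presets, as an illustrative sketch; only the 1:1 and 16:9 maxima come from the list above, and the remaining values are assumptions:

```python
# Illustrative aspect-ratio presets: four quality levels per ratio.
# Only 2048x2048 (1:1) and 2048x1152 (16:9) are documented maxima above.
ASPECT_PRESETS = {
    "1:1":  [(512, 512), (1024, 1024), (1536, 1536), (2048, 2048)],
    "16:9": [(512, 288), (1024, 576), (1536, 864), (2048, 1152)],
    # 9:16, 4:3, 3:4, 3:2, and 2:3 would follow the same pattern
}
```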
- Python 3.10+ with pip
- PowerShell (Windows) for best compatibility
- 16GB+ RAM recommended
- NVIDIA GPU with 8GB+ VRAM (optional, but recommended)
- Clone the repository:

```bash
git clone https://github.com/DexterLagan/qwen-image-generator.git
cd qwen-image-generator
```

- Install dependencies:

```bash
pip install -r requirements.txt
```

- Launch the application:

```bash
python qwen_gui.py
```

- Open your browser to `http://localhost:7860`
- First run: click "Download & Load Model" (one-time download; ~20GB for the default FP8 variant)
Two versions of the Qwen-Image model are available:
- FP8 (~20GB) → https://huggingface.co/Qwen/Qwen-Image
- FP16 (~40GB) → https://huggingface.co/Qwen/Qwen-Image-FP16
Select the desired variant in the application before downloading.
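If you prefer to pre-fetch a variant outside the GUI, something like the following should work with `huggingface_hub`; the `cache_dir` value assumes the project's `model/cache/` layout:

```python
from huggingface_hub import snapshot_download

# Pre-download the FP8 variant into the project's cache directory;
# this is roughly what the GUI's download button does for you.
snapshot_download(repo_id="Qwen/Qwen-Image", cache_dir="model/cache")
```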
The application automatically detects your hardware and optimizes accordingly:
- GPU Available (16GB+ VRAM): Full GPU acceleration
- GPU Available (<16GB VRAM): CPU offloading with GPU compute
- CPU Only: Sequential CPU offloading with memory optimization
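Those tiers could map onto a small detection helper like this sketch; the 16GB threshold mirrors the tiers above, and the strategy names are illustrative:

```python
import torch

def pick_strategy() -> str:
    """Choose a loading strategy from detected hardware."""
    if not torch.cuda.is_available():
        return "cpu-sequential-offload"
    vram_gb = torch.cuda.get_device_properties(0).total_memory / 1024**3
    return "full-gpu" if vram_gb >= 16 else "cpu-offload"
```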
- CPU Offloading: Stores large models in RAM, uses GPU for computation
- Attention Slicing: Reduces memory usage during generation
- Smart Caching: Reuses downloaded model files
- Memory Monitoring: Real-time RAM/VRAM usage reporting
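In `diffusers` terms, the techniques above correspond roughly to the calls below, with `psutil` (pinned in the dependency list) covering the RAM reporting. This is a sketch of the idea, not the exact code in `qwen_gui.py`:

```python
import psutil

def apply_memory_optimizations(pipe, gpu_available: bool, vram_gb: float) -> None:
    """Apply the offloading/slicing techniques above to a diffusers pipeline."""
    pipe.enable_attention_slicing()           # reduce peak memory during generation
    if not gpu_available:
        pipe.enable_sequential_cpu_offload()  # strictest memory mode, CPU-only
    elif vram_gb < 16:
        pipe.enable_model_cpu_offload()       # weights in RAM, compute on GPU

def report_memory() -> None:
    """Real-time RAM usage report via psutil."""
    ram = psutil.virtual_memory()
    print(f"RAM used: {ram.used / 1024**3:.1f} / {ram.total / 1024**3:.1f} GB")
```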
```
qwen-image-generator/
├── qwen_gui.py                          # 🚀 Main Gradio application
├── qwen_sample_hf.py                    # 📄 Official HuggingFace reference
├── qwen_gradio_proto_chatgpt_wrong.py   # 📜 Legacy prototype (archived)
├── test_launch.py                       # 🧪 Basic functionality test
├── requirements.txt                     # 📦 Python dependencies
├── console.log                          # 📋 Application logs
├── model/cache/                         # 🤖 Downloaded model files
├── output/                              # 🖼️ Generated images
└── screenshots/                         # 📸 Documentation images
```
- API: Uses `diffusers.DiffusionPipeline` (not `transformers`)
- Model: Qwen/Qwen-Image from Hugging Face
- Parameters: Official `true_cfg_scale`, aspect ratios, and resolutions
- Caching: Intelligent model file verification and reuse
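Following the official reference script (which `qwen_sample_hf.py` mirrors), a minimal generation call looks roughly like this; the prompt and resolution are illustrative:

```python
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "Qwen/Qwen-Image",
    torch_dtype=torch.bfloat16,
    cache_dir="model/cache",  # matches the project layout above
)

image = pipe(
    prompt="A mountain lake at sunrise, 4K, ultra HD, cinematic composition",
    negative_prompt="blurry, low quality",
    width=1664, height=928,   # an illustrative 16:9 resolution
    num_inference_steps=50,
    true_cfg_scale=4.0,       # the official CFG parameter
    generator=torch.Generator(device="cpu").manual_seed(42),
).images[0]
image.save("output/example.png")
```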
- PyTorch: CUDA-enabled version for GPU acceleration
- Gradio: Latest version with compatibility fixes
- Cross-Platform: Windows tested, Linux/macOS compatible
- Environment Variables: CUDA memory optimization settings
- Progressive Loading: Components loaded sequentially
- Error Recovery: Multiple fallback strategies for reliability
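The error-recovery fallbacks might look like the sketch below; the strategy names and the `load_pipeline` helper are hypothetical:

```python
def load_with_fallbacks(model_id: str):
    """Try progressively more conservative loading strategies."""
    for strategy in ("full-gpu", "cpu-offload", "cpu-sequential-offload"):
        try:
            return load_pipeline(model_id, strategy)  # load_pipeline: hypothetical helper
        except (RuntimeError, MemoryError) as err:
            print(f"{strategy} failed ({err}); trying the next fallback")
    raise RuntimeError("All loading strategies failed")
```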
The application sets optimal configurations automatically:
```python
import os

# Gradio analytics disabled for privacy
os.environ['GRADIO_ANALYTICS_ENABLED'] = 'False'
# Note: CUDA expandable_segments removed for platform compatibility
```
- Windows: `model/cache/models--Qwen--Qwen-Image/`
- Output: `output/qwen_image_YYYYMMDD_HHMMSS_seedXXXXXX.png`
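A small helper matching that naming scheme might look like this, assuming a six-digit, zero-padded seed:

```python
from datetime import datetime

def output_path(seed: int) -> str:
    """Build output/qwen_image_YYYYMMDD_HHMMSS_seedXXXXXX.png."""
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    return f"output/qwen_image_{stamp}_seed{seed:06d}.png"
```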
- Native Desktop App: Rust-based wrapper for system integration
- Embedded Runtime: Self-contained Python environment
- Cross-Platform Distribution: .exe, .app, and .deb packages
- Enhanced UX: Native file dialogs, system tray, auto-updater
- Tauri desktop wrapper implementation
- Batch image generation
- Image gallery and history
- Advanced prompt templates
- Model fine-tuning support
The following versions have been tested and work together:
```
torch==2.7.1
torchvision==0.22.1
transformers==4.55.0
accelerate==1.10.0
safetensors==0.5.3
gradio==5.41.1
psutil==7.0.0
```
For GPU support (though CPU-only mode is recommended for this model):
```bash
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu118
```
- Symptoms: Process exits or crashes during model loading without an error message
- Cause: Insufficient Windows virtual memory for large model loading
- Solution:
  1. Press `Win + R`, type `sysdm.cpl`, press Enter
  2. Advanced tab → Performance Settings → Advanced → Virtual Memory → Change
  3. Uncheck "Automatically manage paging file size for all drives" and select "Custom size"
  4. Set Initial size: `50000` MB, Maximum size: `50000` MB
  5. Click Set → OK → restart the computer
- Impact: CRITICAL - the model will not load without this fix
- Symptoms: CUDA OOM errors despite CPU-only configuration
- Cause: The Qwen-Image model (~20GB) is too large for most GPUs
- Solution: The application automatically uses CPU-only mode
- User Action: Restart the application if the error persists
- Symptoms: NotImplementedError during image generation
- Cause: Mixed GPU/CPU operations creating invalid tensor states
- Solution: The application uses pure CPU-only mode to prevent this
- User Action: Restart the application and ensure CPU-only mode is active
- Symptoms: Loading fails partway through, safetensors errors
- Cause: Interrupted downloads, disk issues, or network problems
- Solution: Click the "🗑️ Clear Cache & Re-download" button
- Prevention: Ensure a stable internet connection during the initial download
Symptoms: "Unable to configure formatter" errors Cause: Logging conflicts between custom code and Gradio Solution: Fixed in current version with proper logging configuration User Action: Use PowerShell instead of CMD on Windows
- RAM: 24GB available RAM (32GB total recommended)
- Storage: 30GB+ free space (~20GB model files plus temporary space; more for the FP16 variant)
- CPU: Multi-core processor (Intel i5/AMD Ryzen 5 or better)
- OS: Windows 10/11 (PowerShell required)
- Virtual Memory: 50GB Windows page file (ESSENTIAL)
- Python: 3.10+ with pip
- Network: Stable connection for the initial model download (~20GB)
- Generation Time: 5-15 minutes per image on CPU
- Loading Time: 5-10 minutes first run (one-time)
- GPU Usage: Not recommended due to 20GB model size
- Logs: Check `console.log` for detailed operation information
- Status: UI shows real-time model status and error details
- Memory: Application monitors and reports RAM usage
- Recovery: Automatic cache clearing on corruption detection
- Use PowerShell (not CMD) for better Unicode support
- Page file configuration is essential for model loading
- Antivirus: May quarantine model files, add exclusion for project folder
- Installing packages may require `sudo` permissions
- Virtual memory configuration varies by distribution
- Monitor `/tmp` space during model loading
| Error | Cause | Solution |
|---|---|---|
| Process killed at shard 6/9 | Insufficient virtual memory | Increase Windows page file to 50GB |
| CUDA out of memory | Model too large for GPU | Application auto-uses CPU mode |
| Safetensors error | Corrupted model files | Use the "Clear Cache & Re-download" button |
| Network timeout | Unstable connection | Retry the download, check your internet connection |
| Permission denied | Folder access issues | Run as administrator or check permissions |
| Import errors | Missing dependencies | Reinstall with `pip install -r requirements.txt` |
- Close other applications to free RAM during loading
- Use SSD storage if available for faster model loading
- Disable antivirus real-time scanning for project folder temporarily
- Increase page file on fastest drive (usually C:) for better performance
This project is licensed under the GNU General Public License v3.0 - see the LICENSE file for details.
The Qwen-Image model is licensed under the Qianwen License 1.0:
- ✅ Free for personal, research, and evaluation use
- ❌ Commercial use requires a separate license from Alibaba
- Model: Alibaba Qwen Team (Qwen-Image)
- Framework: Hugging Face (Diffusers, Gradio)
- Future Desktop: Tauri
- Author: Dexter Santucci
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
If you encounter any issues:
- Check the `console.log` file for error details
- Ensure you're using PowerShell on Windows
- Verify your Python and dependency versions
- Open an issue on GitHub with your log file