Diffusion Model is Secretly a Training-free Open Vocabulary Semantic Segmenter

By Jinglong Wang, Xiawei Li, Jing Zhang, Qingyuan Xu, Qin Zhou, Qian Yu, Lu Sheng, Dong Xu.

This repository is the official implementation of the paper "Diffusion Model is Secretly a Training-free Open Vocabulary Semantic Segmenter". You are also welcome to visit our project page.

New Paper🎉

We are thrilled to announce our latest paper, "ELBO-T2IAlign: A Generic ELBO-Based Method for Calibrating Pixel-level Text-Image Alignment in Diffusion Models", which explores the impact of the original diffusion loss function. This work builds upon this repo and offers new insights into downstream tasks of diffusion models. Check it out to see how it enhances our understanding of diffusion models.

Citing DiffSegmenter

If you find DiffSegmenter useful in your research, please consider citing:

@misc{wang2023diffusion,
      title={Diffusion Model is Secretly a Training-free Open Vocabulary Semantic Segmenter}, 
      author={Jinglong Wang and Xiawei Li and Jing Zhang and Qingyuan Xu and Qin Zhou and Qian Yu and Lu Sheng and Dong Xu},
      year={2023},
      eprint={2309.02773},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Installation

Requirements

Linux, CUDA>=11.7, GCC>=9.4
Python>=3.8

We recommend using Anaconda to create a conda environment:
```
conda create -n ldm python=3.8
```
Then, activate the environment:
```
conda activate ldm
```
Finally, install the remaining requirements:
```
pip install -r requirements.txt
```
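
Once the environment is active, the version requirements listed above can be sanity-checked from Python. The following is a minimal sketch, not part of this repository: the helper names `parse_version`, `meets_minimum`, and `check_environment` are hypothetical. It only checks the operating system and the Python version. CUDA and GCC are not probed automatically, but `meets_minimum` can compare a version string you read from your own toolchain against the minimums above.

```python
import platform
import sys


def parse_version(text):
    """Turn a dotted version string such as '11.7' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def meets_minimum(found, required):
    """Return True if the version string `found` is at least `required`."""
    return parse_version(found) >= parse_version(required)


def check_environment():
    """Collect readable problems; an empty list means the basic checks passed."""
    problems = []
    if platform.system() != "Linux":
        problems.append(f"expected Linux, found {platform.system()}")
    python_version = ".".join(str(n) for n in sys.version_info[:3])
    if not meets_minimum(python_version, "3.8"):
        problems.append(f"expected Python>=3.8, found {python_version}")
    return problems


if __name__ == "__main__":
    # Print one warning per unmet requirement (prints nothing if all pass).
    for problem in check_environment():
        print("WARNING:", problem)
```

For example, `meets_minimum("11.8", "11.7")` returns `True`, while `meets_minimum("9.3.0", "9.4")` returns `False`.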

Usage

Dataset preparation

Please download the datasets and organize them as follows:


├── COCO2014
│   ├── annotations
│   ├── coco_seg_anno
│   ├── images
│   │   ├── test2014
│   │   ├── train2014
│   │   └── val2014
│   └── mask
│       ├── train2014
│       └── val2014


└── VOCdevkit
    ├── VOC2010
    │   ├── Annotations
    │   ├── ImageSets
    │   │   ├── Action
    │   │   ├── Layout
    │   │   ├── Main
    │   │   ├── Segmentation
    │   │   └── SegmentationContext
    │   ├── JPEGImages
    │   ├── SegmentationClass
    │   ├── SegmentationClassContext
    │   └── SegmentationObject
    └── VOC2012
        ├── Annotations
        ├── ImageSets
        │   ├── Action
        │   ├── Layout
        │   ├── Main
        │   └── Segmentation
        ├── JPEGImages
        ├── SegmentationClass
        ├── SegmentationClassAug
        └── SegmentationObject
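
Before running any script, it can help to confirm that the folders match the layout above. The sketch below is an illustration, not a utility shipped with this repository. It assumes that `COCO2014` and `VOCdevkit` sit under one common root directory, which the layout above does not state, and the names `EXPECTED_DIRS` and `missing_dirs` are hypothetical.

```python
from pathlib import Path

# Sub-directories copied from the layout above, relative to each dataset folder.
EXPECTED_DIRS = {
    "COCO2014": [
        "annotations", "coco_seg_anno",
        "images/test2014", "images/train2014", "images/val2014",
        "mask/train2014", "mask/val2014",
    ],
    "VOCdevkit": [
        "VOC2010/Annotations",
        "VOC2010/ImageSets/Action", "VOC2010/ImageSets/Layout",
        "VOC2010/ImageSets/Main", "VOC2010/ImageSets/Segmentation",
        "VOC2010/ImageSets/SegmentationContext",
        "VOC2010/JPEGImages", "VOC2010/SegmentationClass",
        "VOC2010/SegmentationClassContext", "VOC2010/SegmentationObject",
        "VOC2012/Annotations",
        "VOC2012/ImageSets/Action", "VOC2012/ImageSets/Layout",
        "VOC2012/ImageSets/Main", "VOC2012/ImageSets/Segmentation",
        "VOC2012/JPEGImages", "VOC2012/SegmentationClass",
        "VOC2012/SegmentationClassAug", "VOC2012/SegmentationObject",
    ],
}


def missing_dirs(root):
    """Return the expected directories that do not exist under `root`.

    `root` is assumed (not stated in this README) to contain both
    COCO2014 and VOCdevkit.
    """
    root = Path(root)
    return [
        f"{dataset}/{sub}"
        for dataset, subs in EXPECTED_DIRS.items()
        for sub in subs
        if not (root / dataset / sub).is_dir()
    ]
```

Calling `missing_dirs("/path/to/datasets")` with your own root returns an empty list when every folder is in place, and otherwise lists the paths that still need to be created or downloaded.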

Open Vocabulary Semantic Segmentation

Evaluation

In the open vocabulary semantic segmentation setting, our model does not require training; it directly produces segmentation results.

The 'open_vocabulary' folder contains the code for open vocabulary semantic segmentation. It includes scripts for the VOC, COCO, and Pascal Context datasets.

Taking the voc10 dataset as an example:

Step 1: Modify your dataset path in the Python file.

Step 2: Run ptp_stable_voc10.py to generate segmentation results.

python ptp_stable_voc10.py

Step 3: Update the file path in the evaluation script, then run it. The mIoU will be recorded in eval.txt.

python evaluation_voc10.py
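
For readers unfamiliar with the metric, the sketch below shows how mIoU (mean intersection over union) is commonly computed from predicted and ground-truth label maps. It is an illustration of the standard definition, not the code in `evaluation_voc10.py`, which may differ in detail. The function names are hypothetical, label maps are assumed to be flattened into sequences of class ids, and `ignore_index=255` follows the Pascal VOC convention for void pixels; whether the evaluation script treats them the same way is an assumption.

```python
def confusion_matrix(pred, target, num_classes, ignore_index=255):
    """Count (target, pred) label pairs over flat sequences of class ids.

    Predictions are assumed to lie in the range [0, num_classes).
    """
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # pixels labeled "ignore" do not contribute
        matrix[t][p] += 1
    return matrix


def mean_iou(matrix):
    """Average per-class IoU = TP / (TP + FP + FN), skipping absent classes."""
    ious = []
    for c in range(len(matrix)):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                 # class c pixels predicted as another class
        fp = sum(row[c] for row in matrix) - tp  # other pixels predicted as class c
        union = tp + fp + fn
        if union > 0:
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: four pixels, two classes, last pixel ignored.
pred = [0, 0, 1, 1]
target = [0, 1, 1, 255]
print(mean_iou(confusion_matrix(pred, target, num_classes=2)))  # prints 0.5
```

In the toy example each class has one true positive and a union of two pixels, so both per-class IoUs are 0.5 and so is their mean.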
