This repo contains the code for the paper: Mitigating Hallucinations in Large Vision-Language Models by Adaptively Constraining Information Flow (AAAI 2025).
This code is built upon the PyTorch Lightning framework and has been tested with Python 3.9, PyTorch 2.3.0, and transformers 4.28.0.
pip install -r requirements.txt
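For reference, a minimal sketch of a full environment setup is shown below; the environment name is hypothetical, and the pinned torch/transformers versions simply mirror the tested configuration stated above:
# Sketch of an environment setup; "acif" is an arbitrary environment name,
# and the pinned versions mirror the tested configuration above.
conda create -n acif python=3.9 -y
conda activate acif
pip install torch==2.3.0 transformers==4.28.0
pip install -r requirements.txt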
The training and evaluation datasets are from LLaVA-Instruct-150K and POPE. The grounded images of these datasets can be downloaded from coco2014. Download the train and val images and put them under the corresponding paths.
The train images should be put under the path you specified in
minigpt4/configs/datasets/coco2014/align.yaml
The val images should be put under the path you specified in
minigpt4/configs/datasets/pope_eval/align.yaml
Note that the train and evaluation data sampled from LLaVA-Instruct-150K should be put under the same path as the train images.
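As an illustration only, the two dataset configs typically expose an image-root entry; the key names below follow the MiniGPT-4 dataset-config convention and are assumptions about this repo's yaml files, not their verified contents:
# minigpt4/configs/datasets/coco2014/align.yaml (sketch; key names are assumed)
datasets:
  coco2014:
    data_type: images
    build_info:
      storage: /data/coco2014/train2014   # train images + sampled LLaVA-Instruct-150K data

# minigpt4/configs/datasets/pope_eval/align.yaml (sketch; key names are assumed)
datasets:
  pope_eval:
    data_type: images
    build_info:
      storage: /data/coco2014/val2014     # val images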
MiniGPT-4: Download the Vicuna-7B, Q-Former, and ViT weights, then put them under the paths you specified in
minigpt4/configs/models/minigpt4.yaml
Specify the paths via the corresponding keys:
llama_model: <Path to vicuna-7b>
q_former_model: <Path to blip2_pretrained_flant5xxl.pth>
vit_model: <Path to eva_vit_g.pth>
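For instance, with an on-disk layout like the following (the directory locations are arbitrary; only the two .pth filenames come from the keys above), all three keys would point into one weights folder:
weights/
  vicuna-7b/                        # HF-format Vicuna-7B weights directory
  blip2_pretrained_flant5xxl.pth    # Q-Former weights
  eva_vit_g.pth                     # ViT weights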
Additionally, you need to download the pre-trained vision-language projector of MiniGPT-4 from here, and put it under the path you specified in the following files:
minigpt4_configs/minigpt4_stage2_finetune.yaml
minigpt4_configs/minigpt4_pope_eval.yaml
minigpt4_configs/minigpt4_coco_eval.yaml
The path should be delivered to the ckpt key in each *.yaml file. For example:
ckpt: <Path to the prerained_minigpt4_7b.pth>
Following the default settings in minigpt4_configs/minigpt4_stage2_finetune.yaml, specify output_dir in this file and run the following to train the model:
export CUDA_VISIBLE_DEVICES=0
python finetune.py --cfg_path minigpt4_configs/minigpt4_stage2_finetune.yaml
This code supports multi-GPU training; simply specify the IDs of the GPUs you want to use, as in the sketch below.
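For example, a multi-GPU launch might look like the following; the device IDs are illustrative, and this assumes the Lightning trainer is configured to use all visible devices:
# Illustrative: expose four GPUs to the process (device IDs are examples)
export CUDA_VISIBLE_DEVICES=0,1,2,3
python finetune.py --cfg_path minigpt4_configs/minigpt4_stage2_finetune.yaml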
Once the training is finished, you will get a .ckpt file under the output_dir: the checkpoint of the vision-language projector you just trained. Specify the path to this checkpoint in the following files:
minigpt4_configs/minigpt4_pope_eval.yaml
minigpt4_configs/minigpt4_coco_eval.yaml
The path of the checkpoint file should be delivered to the lightning_ckpt key in minigpt4_configs/*_eval.yaml, for example:
lightning_ckpt: <Path to the trained_projector.ckpt>
Run the following commands to conduct model inference on POPE and COCO, respectively:
python pope_eval.py --cfg_path minigpt4_configs/minigpt4_pope_eval.yaml
python coco_eval.py --cfg_path minigpt4_configs/minigpt4_coco_eval.yaml
Many thanks to the following projects, upon which this project is partially built:
MiniGPT-4; LURE; LLaVA; POPE; CHAIR
If you find our work helpful, please use the following citation:
@article{bai2025mitigating,
title={Mitigating Hallucinations in Large Vision-Language Models by Adaptively Constraining Information Flow},
author={Bai, Jiaqi and Guo, Hongcheng and Peng, Zhongyuan and Yang, Jian and Li, Zhoujun and Li, Mohan and Tian, Zhihong},
journal={arXiv preprint arXiv:2502.20750},
year={2025}
}