Inference of MiniGPT4 in pure C/C++.
The main goal of minigpt4.cpp is to run MiniGPT4 using 4-bit quantization with the ggml library.
Requirements: git
git clone --recursive https://github.com/Maknee/minigpt4.cpp
cd minigpt4.cpp
Go to Releases and extract the minigpt4 library file into the repository directory, or build the library from source as described below.
Requirements (Windows): CMake, Visual Studio, and Git
cmake .
cmake --build . --config Release
bin\Release\minigpt4.dll should be generated
Requirements: CMake (Ubuntu: sudo apt install cmake)
cmake .
cmake --build . --config Release
minigpt4.so should be generated
Requirements: CMake (MacOS: brew install cmake)
cmake .
cmake --build . --config Release
minigpt4.dylib should be generated
Note: To build with OpenCV (enabling features such as loading and preprocessing images within the library itself), set MINIGPT4_BUILD_WITH_OPENCV to ON in CMakeLists.txt or pass -DMINIGPT4_BUILD_WITH_OPENCV=ON to the cmake CLI.
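Whichever platform you build on, a quick way to confirm the resulting shared library loads is to open it with Python's ctypes. This is only a sanity-check sketch; the paths below assume the default output locations mentioned above, so adjust them to wherever your build placed the file.

import ctypes
import pathlib
import platform

# Default output locations from the builds above (assumed; adjust to your build tree)
lib_names = {
    "Windows": "bin/Release/minigpt4.dll",
    "Linux": "minigpt4.so",
    "Darwin": "minigpt4.dylib",
}
lib_path = pathlib.Path(lib_names[platform.system()]).resolve()
ctypes.CDLL(str(lib_path))  # raises OSError if the library cannot be loaded
print(f"Loaded {lib_path}")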
Pre-quantized models are available on Hugging Face in 7B and 13B variants.
Recommended for reliable results, but slow inference speed: minigpt4-13B-f16.bin
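If you would rather fetch the pre-quantized files from a script than through the browser, the huggingface_hub package can download them directly. The repository ID below is a placeholder, not the real repo name; substitute the Hugging Face repository that hosts the files (the same approach works for the pre-quantized Vicuna model further below).

from huggingface_hub import hf_hub_download

# repo_id is a placeholder -- replace it with the Hugging Face repository
# that actually hosts the pre-quantized minigpt4.cpp models.
path = hf_hub_download(
    repo_id="<user>/<minigpt4-ggml-repo>",
    filename="minigpt4-13B-f16.bin",
)
print("Downloaded to", path)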
Requirements: Python 3.x and PyTorch.
Clone the MiniGPT-4 repository and perform the setup
cd minigpt4
git clone https://github.com/Vision-CAIR/MiniGPT-4.git
cd MiniGPT-4
conda env create -f environment.yml
conda activate minigpt4
Download the pretrained checkpoint from the MiniGPT-4 repository under Checkpoint Aligned with Vicuna 7B or Checkpoint Aligned with Vicuna 13B, or download it from the Hugging Face links for 7B or 13B.
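Before converting, it can be worth confirming that the downloaded checkpoint is intact and loadable; a minimal sketch with PyTorch (already required above) is enough. The filename is an example, so point it at whichever checkpoint you downloaded.

import torch

# Example filename -- use the checkpoint you actually downloaded.
# On recent PyTorch versions you may need to pass weights_only=False.
ckpt = torch.load("pretrained_minigpt4.pth", map_location="cpu")
print(type(ckpt))
if isinstance(ckpt, dict):
    print(list(ckpt.keys()))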
Convert the model weights into ggml format
7B model (Windows)
cd minigpt4
python convert.py C:\pretrained_minigpt4_7b.pth --ftype=f16
13B model (Windows)
cd minigpt4
python convert.py C:\pretrained_minigpt4.pth --ftype=f16
7B model (Linux / MacOS)
python convert.py ~/Downloads/pretrained_minigpt4_7b.pth --outtype f16
13B model (Linux / MacOS)
python convert.py ~/Downloads/pretrained_minigpt4.pth --outtype f16
minigpt4-7B-f16.bin or minigpt4-13B-f16.bin should be generated
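As a quick check that the conversion finished, the sketch below simply confirms the expected output file exists and reports its size; it assumes the default filenames above and that you run it from the directory where convert.py wrote them.

import pathlib

# Default output names from convert.py above; adjust paths if you ran it elsewhere.
for name in ("minigpt4-7B-f16.bin", "minigpt4-13B-f16.bin"):
    p = pathlib.Path(name)
    if p.exists():
        print(f"{name}: {p.stat().st_size / 1e9:.2f} GB")
    else:
        print(f"{name}: not found")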
Pre-quantized Vicuna models are available on Hugging Face.
Recommended for reliable results and decent inference speed: ggml-vicuna-13B-v0-q5_k.bin
Requirements: Python 3.x and PyTorch.
Follow the guide in the MiniGPT-4 repository to obtain the vicuna-v0 model.
Then, clone llama.cpp
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake .
cmake --build . --config Release
Convert the model to ggml
python convert.py <path-to-model>
Quantize the model with the quantize tool built by llama.cpp in the previous step
./quantize <path-to-model> <output-model> Q4_1
Test if minigpt4 works by calling the following, replacing minigpt4-13B-f16.bin and ggml-vicuna-13B-v0-q5_k.bin with your respective models
cd minigpt4
python minigpt4_library.py minigpt4-13B-f16.bin ggml-vicuna-13B-v0-q5_k.bin
Install the requirements for the webui
pip install -r requirements.txt
Then, run the webui, replacing minigpt4-13B-f16.bin and ggml-vicuna-13B-v0-q5_k.bin with your respective models
python webui.py minigpt4-13B-f16.bin ggml-vicuna-13B-v0-q5_k.bin
The output should contain something like the following:
Running on local URL: http://127.0.0.1:7860
To create a public link, set `share=True` in `launch()`.
Go to http://127.0.0.1:7860 in your browser and you should be able to interact with the webui.
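If you want to check from a script that the webui actually came up before pointing a browser at it, something like the following works against the default local address shown above.

import urllib.request

# Default local address printed by the webui above
with urllib.request.urlopen("http://127.0.0.1:7860", timeout=5) as resp:
    print("webui reachable, HTTP status:", resp.status)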

