LLaVA has various quantized models available in GGUF format, so in principle it should work with llama.cpp (see https://github.com/ggerganov/llama.cpp/pull/3436). Is this possible?