Description
What happened?
I ran into this issue while working on a PR on HF adding GGUF support for the phi3 model.
When using gguf-my-repo (or `convert_hf_to_gguf.py`) to convert from Hugging Face to GGUF, `merges` is missing from the resulting GGUF file.
Below is an already converted TinyLlama-1.1B-Chat-v1.0-GGUF, and you can see there is a `merges` section in the GGUF tokenizer:
Here is a TinyLlama I converted a few days ago via gguf-my-repo, and it is missing `merges` from the tokenizer:
I was able to check out llama.cpp and reproduce via:

```
python3.10 ./convert_hf_to_gguf.py TinyLlama-1.1B-Chat-v1.0 --outtype f16 --outfile TinyLlama-1.1B-Chat-v1.0-fp16.gguf
```
I am not familiar with the conversion script, but I investigated, and I think I understand the issue and also have a fix:
- Case where `tokenizer.model` is present:
  This bug can happen for any model class that calls `_set_vocab_sentencepiece()`. In the case where a `tokenizer.model` is present, `_create_vocab_sentencepiece()` never throws an exception, and back in `_set_vocab_sentencepiece()` `load_merges` is not passed as `True`, so this is one place we would have to fix.
- Case where `tokenizer.model` is not present and `tokenizer.json` is present:
  This happens for the Llama family models only if `_set_vocab_llama_hf()` is invoked: the `self._set_vocab_sentencepiece()` call, which is wrapped in a try/except inside the `LlamaModel` class, fails, as it does in my case since there is no `tokenizer.model` file for the Llama model or phi3, but there is a `tokenizer.json`. For this case we can fix it at convert_hf_to_gguf.py#L806 by passing `load_merges=True` on that line:
```python
special_vocab = gguf.SpecialVocab(self.dir_model, load_merges=True, n_vocab=len(tokens))
```
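To make the two cases concrete, here is a simplified, hypothetical sketch of the branching described above. `SpecialVocab` below is a stub standing in for `gguf.SpecialVocab`; only the `load_merges` plumbing is modeled, not the real vocabulary handling in `convert_hf_to_gguf.py`:

```python
from pathlib import Path


class SpecialVocab:
    """Hypothetical stand-in for gguf.SpecialVocab: records whether merges were requested."""

    def __init__(self, dir_model, load_merges=False, n_vocab=None):
        self.load_merges = load_merges
        self.n_vocab = n_vocab


def set_vocab(dir_model: str, tokens: list) -> SpecialVocab:
    model_dir = Path(dir_model)
    if (model_dir / "tokenizer.model").exists():
        # Case 1: tokenizer.model is present -> sentencepiece path.
        # The fix is to pass load_merges=True here so merges reach the GGUF tokenizer.
        return SpecialVocab(dir_model, load_merges=True, n_vocab=len(tokens))
    # Case 2: no tokenizer.model, fall back to the tokenizer.json path
    # (e.g. via _set_vocab_llama_hf()). The same fix applies at this call site.
    return SpecialVocab(dir_model, load_merges=True, n_vocab=len(tokens))


vocab = set_vocab("TinyLlama-1.1B-Chat-v1.0", ["<s>", "</s>"])
print(vocab.load_merges)
```

In both branches the only change relative to the current behavior is that `load_merges=True` is passed explicitly, so the `merges` section is written into the GGUF tokenizer regardless of which tokenizer file was found.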
If the above fixes make sense, I can create a PR!
Name and Version
version: 3660 (b69a480)
built with Apple clang version 15.0.0 (clang-1500.0.40.1) for arm64-apple-darwin23.1.0
What operating system are you seeing the problem on?
Mac
Relevant log output
No response