infill : add download instructions for model #6626
Merged
Clarify that we download CodeLlama because it has infill support.
We can improve this by using the latest CodeGemma models - I think they are smaller and also have infill support. It would also be a good exercise to verify that we support them.
I've updated the text to clarify the usage of CodeLlama now 👍
Regarding CodeGemma, I tried out a smaller model, codegemma-2b-f16.gguf, which did not produce a good result. I'll try a larger model, codegemma-7b-it-f16.gguf, and see if it works (if I can run it). One thing I've noticed is that these models are gated, so I was not able to use the hf script to download them but had to download them manually.
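For reference, a minimal sketch of the manual download using huggingface_hub - the repo id here is an assumption, and an access token is needed because the repo is gated:

```python
# Sketch of a manual download for a gated model via huggingface_hub.
# The repo id is an assumption; substitute the actual CodeGemma GGUF repo.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="google/codegemma-2b-GGUF",  # assumed repo id
    filename="codegemma-2b-f16.gguf",
    token="hf_...",                      # gated repos require a user access token
)
print(path)
```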
Actually, looking into this a little closer, it looks like CodeGemma is using different ids for the special tokens than the ones that are currently specified in llama.cpp, so for CodeGemma models the ids would need to be set differently.
Should there be a check for the CodeGemma model name in llm_load_vocab to handle this, or what would be a good way to address this issue?
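As a quick check, here is a minimal sketch (assuming access to the gated google/codegemma-2b repo) that prints the ids of CodeGemma's documented FIM tokens, for comparison with the ids currently hardcoded in llm_load_vocab:

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("google/codegemma-2b")  # gated; requires HF auth

# CodeGemma's documented FIM special tokens; compare the printed ids with the
# CodeLlama ids currently hardcoded in llama.cpp's llm_load_vocab.
for t in ["<|fim_prefix|>", "<|fim_middle|>", "<|fim_suffix|>", "<|file_separator|>"]:
    print(t, tok.convert_tokens_to_ids(t))
```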
Ah yes, this should be fixed - likely we need to add these special tokens to the GGUF metadata and start using them. Could you verify that using the correct token ids (by hardcoding them) produces correct results?
The following is the output when using the special token ids from CodeGemma:
This looks better I think, and more in line with what the CodeLlama model outputs.
I'd be happy to take a stab at adding these special tokens to the GGUF meta data, if that is alright?
Need to add these in the GGUF metadata and then use them inside llama.cpp.
Btw, the example that you posted seems kind of OK, but I would have expected it, after finishing the hello world print, to create a new function def goodbyeworld():. There might be some other issues still.
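For reference, a minimal sketch of how the ids could be written into (and read back from) a GGUF file with gguf-py - the tokenizer.ggml.* key names and the ids here are assumptions, not a settled scheme:

```python
from gguf import GGUFWriter, GGUFReader

# Write the assumed FIM token-id keys into a small test file ...
writer = GGUFWriter("codegemma-fim-test.gguf", "gemma")
writer.add_uint32("tokenizer.ggml.prefix_token_id", 67)  # assumed id
writer.add_uint32("tokenizer.ggml.middle_token_id", 68)  # assumed id
writer.add_uint32("tokenizer.ggml.suffix_token_id", 69)  # assumed id
writer.write_header_to_file()
writer.write_kv_data_to_file()
writer.close()

# ... and read them back to confirm they round-trip.
reader = GGUFReader("codegemma-fim-test.gguf")
for key in ("tokenizer.ggml.prefix_token_id",
            "tokenizer.ggml.middle_token_id",
            "tokenizer.ggml.suffix_token_id"):
    print(key, key in reader.fields)
```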
Could you give me a few pointers as to where this metadata is added? Is it added in gguf.py, like in constants.py, similar to the other special tokens like BOS_ID, EOS_ID, UNK_ID, etc.?
I think I've figured this out and will dig in a little more and open a pull request.
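For reference, here is a minimal sketch of what I had in mind for the constants.py additions, mirroring the existing BOS_ID / EOS_ID / UNK_ID entries (the names and key strings are assumptions on my part, not the final change):

```python
# Hypothetical additions to gguf-py/gguf/constants.py, mirroring the existing
# BOS_ID / EOS_ID / UNK_ID entries (names and key strings are assumptions):
class Keys:
    class Tokenizer:
        BOS_ID    = "tokenizer.ggml.bos_token_id"      # existing
        EOS_ID    = "tokenizer.ggml.eos_token_id"      # existing
        UNK_ID    = "tokenizer.ggml.unknown_token_id"  # existing
        PREFIX_ID = "tokenizer.ggml.prefix_token_id"   # new: FIM prefix token
        MIDDLE_ID = "tokenizer.ggml.middle_token_id"   # new: FIM middle token
        SUFFIX_ID = "tokenizer.ggml.suffix_token_id"   # new: FIM suffix token
        EOT_ID    = "tokenizer.ggml.eot_token_id"      # new: end-of-text token
```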
I'll take a closer look at this as part of adding the metadata and see if I can figure out what is wrong.
I just realized that I downloaded the instruction-following (it) model, which is not the code completion model 😞
Using the code completion CodeGemma I get the following output:
I've opened #6689 with a suggestion.