Draft: feat: Support DBRX model in Llama #462
Conversation
Force-pushed f852b16 to 27b9d62
+ "Generation speed is significantly faster than LLaMA2-70B, while at the same time "
+ "beating other open source models, such as, LLaMA2-70B, Mixtral, and Grok-1 on "
+ "language understanding, programming, math, and logic.",
  PromptTemplate.LLAMA,
I think it uses the ChatML prompt template: PromptTemplate.CHAT_ML
Thanks, done
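For context, the ChatML template referenced above wraps each message in `<|im_start|>` / `<|im_end|>` markers. A minimal sketch of what such rendering looks like, assuming a hypothetical `build_chatml_prompt` helper (not part of the plugin, which defines templates in its `PromptTemplate` enum):

```python
# Hedged sketch of ChatML-style prompt rendering; build_chatml_prompt
# is a hypothetical helper, not the plugin's actual implementation.
def build_chatml_prompt(messages):
    """Render a list of {"role", "content"} dicts as a ChatML string."""
    parts = []
    for m in messages:
        # Each turn is delimited by <|im_start|>role ... <|im_end|>
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>")
    # Trailing open assistant turn cues the model to generate a reply
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = build_chatml_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello"},
])
print(prompt)
```

The exact role names and whitespace may differ between implementations; the delimiters are the defining feature of ChatML.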
Since the change was recent, we need to update the llama.cpp submodule as well.
Force-pushed 27b9d62 to 3bc8480
Done
I'll try running the model locally soon and see if any other changes are necessary.
Great! But in this PR I have to implement downloading all 10 files first, I guess... 😅
Force-pushed 4fb52b2 to 7aa08d9
Force-pushed 7aa08d9 to 05cdeed
Force-pushed 05cdeed to c87c1b1
@phymbert I can download https://huggingface.co/phymbert/dbrx-16x12b-instruct-iq3_xxs-gguf in the browser without logging in, but inside the plugin I get 403 Forbidden; is this to be expected with the
DBRX is a gated model, so I believe you have to pass a read token. There is an open issue on llama.cpp to support this.
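Passing a read token for a gated Hugging Face model usually means sending it as a Bearer header on the download request. A minimal sketch, assuming a placeholder token and file URL (the helper name is hypothetical, and no request is actually sent here):

```python
# Hedged sketch: attaching a Hugging Face read token to a download request.
# The token value and URL below are placeholders, not real credentials.
import urllib.request

def gated_download_request(url, token):
    """Build a urllib Request carrying the HF read token as a Bearer header."""
    req = urllib.request.Request(url)
    req.add_header("Authorization", f"Bearer {token}")
    return req

req = gated_download_request(
    "https://huggingface.co/phymbert/dbrx-16x12b-instruct-iq3_xxs-gguf",
    "hf_xxx",  # placeholder read token
)
print(req.get_header("Authorization"))
```

The same `Authorization: Bearer <token>` header is what a plugin-side HTTP client would need to add to avoid the 403.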
Force-pushed 7683b53 to 6479604
Force-pushed e57fa37 to c417cca
Force-pushed ea9d9ee to b4dfde3
Force-pushed d923df0 to 17b179e
Force-pushed da70c82 to ad16f5c
Force-pushed 91e6831 to 4a62471
Closing this due to inactivity.
The new open-source model DBRX sounds amazing. Is this enough, and correct, to integrate it into Llama?
ggml-org/llama.cpp#6515
https://huggingface.co/collections/phymbert/dbrx-16x12b-instruct-gguf-6619a7a4b7c50831dd33c7c8
https://www.databricks.com/blog/announcing-dbrx-new-standard-efficient-open-source-customizable-llms
https://github.com/databricks/dbrx
https://huggingface.co/collections/databricks/
llama.cpp seems to support split/sharded files, but I would need to download all of them first, I suppose... 😅
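For split GGUF models, llama.cpp's `gguf-split` tool names shards with a `-NNNNN-of-NNNNN.gguf` suffix, so enumerating the 10 files to download is straightforward. A minimal sketch, assuming that naming convention (the base name below is a placeholder):

```python
# Hedged sketch: enumerating shard filenames for a split GGUF model,
# assuming llama.cpp's "-00001-of-00010.gguf" naming convention.
def shard_names(base, total):
    """Return the expected filenames for a GGUF model split into `total` shards."""
    return [f"{base}-{i:05d}-of-{total:05d}.gguf" for i in range(1, total + 1)]

# Placeholder base name; the real repo's files may be named differently.
names = shard_names("dbrx-16x12b-instruct-iq3_xxs", 10)
print(names[0])
print(names[-1])
```

A downloader would fetch each name in turn; llama.cpp then only needs the path to the first shard and locates the rest by the same pattern.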