Add docker protocol support for llama-server model loading #15790
Conversation
@CISC @ggerganov PTAL
Pull Request Overview
This PR adds Docker registry support to llama-server, enabling users to pull and run AI models directly from Docker Hub using the docker:// protocol. The implementation handles Docker registry authentication, manifest parsing, and blob downloading to cache models locally.
- Adds Docker URL parsing and resolution functionality to download GGUF models from Docker registries
- Integrates Docker model resolution into the existing model loading pipeline
- Implements streaming download with proper authentication and caching support (a sketch of the pull flow follows this list)
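For orientation, here is a minimal sketch of that pull flow in C++ with libcurl. It assumes anonymous pulls from Docker Hub; the endpoints and manifest media type are standard OCI distribution-spec / Docker Hub details, while the helper names, the `ai/gemma3` repository, and the elided JSON parsing are illustrative, not the code this PR adds.

```cpp
#include <curl/curl.h>
#include <string>

// Collect the response body into a std::string.
static size_t write_to_string(char * ptr, size_t size, size_t nmemb, void * userdata) {
    auto * out = static_cast<std::string *>(userdata);
    out->append(ptr, size * nmemb);
    return size * nmemb;
}

// GET `url`, optionally sending a bearer token and an Accept header.
static std::string http_get(const std::string & url, const std::string & token, const std::string & accept) {
    std::string body;
    CURL * curl = curl_easy_init();
    curl_slist * headers = nullptr;
    if (!token.empty())  headers = curl_slist_append(headers, ("Authorization: Bearer " + token).c_str());
    if (!accept.empty()) headers = curl_slist_append(headers, ("Accept: " + accept).c_str());
    curl_easy_setopt(curl, CURLOPT_URL, url.c_str());
    curl_easy_setopt(curl, CURLOPT_HTTPHEADER, headers);
    curl_easy_setopt(curl, CURLOPT_FOLLOWLOCATION, 1L); // blob GETs often redirect to a CDN
    curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, write_to_string);
    curl_easy_setopt(curl, CURLOPT_WRITEDATA, &body);
    curl_easy_perform(curl);
    curl_slist_free_all(headers);
    curl_easy_cleanup(curl);
    return body;
}

int main() {
    curl_global_init(CURL_GLOBAL_DEFAULT);
    const std::string repo = "ai/gemma3"; // hypothetical repository name

    // 1. Get an anonymous pull token scoped to the repository.
    std::string token_json = http_get(
        "https://auth.docker.io/token?service=registry.docker.io"
        "&scope=repository:" + repo + ":pull", "", "");
    std::string token; // ... extract the "token" field from token_json (JSON parsing elided) ...

    // 2. Fetch the manifest for a tag; scan its layers for the GGUF blob's digest.
    std::string manifest = http_get(
        "https://registry-1.docker.io/v2/" + repo + "/manifests/latest",
        token, "application/vnd.oci.image.manifest.v1+json");
    std::string digest; // ... pick the layer whose mediaType identifies a GGUF file ...

    // 3. Download the blob by digest and write it into the local model cache.
    // http_get("https://registry-1.docker.io/v2/" + repo + "/blobs/" + digest, token, "");

    curl_global_cleanup();
    return 0;
}
```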
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| common/common.cpp | Integrates Docker model resolution into the model loading pipeline and updates error messages |
| common/arg.h | Adds function declaration for Docker model resolution |
| common/arg.cpp | Implements complete Docker registry functionality including authentication, manifest parsing, and blob downloading |
@JohannesGaessler @slaren PTAL
@danbev PTAL
@ggerganov @JohannesGaessler @slaren @danbev struggling to get this reviewed; if you guys have cycles I'd appreciate it.
I'm also having a change of heart and am thinking of changing this to a -d/--docker-repo option; the string would then match the one suggested on Docker Hub and the one used in Docker Model Runner. It would also be more consistent with the huggingface argument approach.
@ggerganov @danbev ready for re-review
Added resumable downloads in a second commit; models can be large, and redownloading them from scratch on interrupted connections can be a pain and a waste of bandwidth.
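For context, a minimal sketch of what the resume amounts to with libcurl, assuming the server supports HTTP range requests; the function name and file handling below are illustrative rather than the PR's actual common_download_file_single changes.

```cpp
#include <curl/curl.h>
#include <cstdio>
#include <filesystem>

// Download `url` to `path`, continuing from any partial file already on disk.
bool download_resumable(const char * url, const char * path) {
    std::error_code ec;
    auto sz = std::filesystem::file_size(path, ec);
    curl_off_t offset = ec ? 0 : (curl_off_t) sz; // 0 if no partial file yet

    FILE * f = std::fopen(path, offset > 0 ? "ab" : "wb");
    if (!f) {
        return false;
    }

    CURL * curl = curl_easy_init();
    curl_easy_setopt(curl, CURLOPT_URL, url);
    curl_easy_setopt(curl, CURLOPT_FOLLOWLOCATION, 1L);
    curl_easy_setopt(curl, CURLOPT_WRITEDATA, f); // libcurl's default callback fwrite()s to this FILE*
    // Sends "Range: bytes=<offset>-" so the server only transmits what we're missing.
    curl_easy_setopt(curl, CURLOPT_RESUME_FROM_LARGE, offset);

    CURLcode res = curl_easy_perform(curl);
    curl_easy_cleanup(curl);
    std::fclose(f);
    return res == CURLE_OK;
}
```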
@ggerganov ready for re-review
ggerganov left a comment
The docker-related functions seem ok.
I'm not confident about the changes to common_download_file_single to support resumable downloads. Either wait for someone to review this part in detail, or move it to a separate PR.
SGTM, I do think the resumable downloads bit is important for the next PR, whether it's huggingface, docker, etc. Somebody has to pay the cloud bill for all the wasted petabytes that get retransferred because of retries. And of course having to restart a download from the beginning because of an interrupted connection is simply annoying; sometimes that can make larger models impossible to download for some people. Some servers even have server-side timeouts if you don't finish the download within a certain time. Resumable downloads solve these things client-side.
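One subtlety for whoever reviews that part (a hedged sketch, not this PR's code): a server that ignores the Range header replies 200 with the full body instead of 206 Partial Content, so appending blindly would corrupt the local file. A post-transfer check like the following lets the caller discard the partial file and restart cleanly.

```cpp
#include <curl/curl.h>

// After curl_easy_perform(), confirm the server actually honored the resume:
// 206 Partial Content means it did; a plain 200 with a nonzero offset means
// the full file was resent from byte zero, so the caller should delete the
// partial file and re-download rather than keep the (now corrupted) append.
bool server_honored_resume(CURL * curl, curl_off_t offset) {
    long status = 0;
    curl_easy_getinfo(curl, CURLINFO_RESPONSE_CODE, &status);
    return offset == 0 || status == 206;
}
```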
@ggerganov all done, it's just the docker pulling change now
Getting build problems unrelated to this PR: x86_64 macOS. Gonna try a rebuild.
@ggerganov green build!
To pull and run models via: llama-server -dr gemma3
Add some validators and sanitizers for Docker Model urls and metadata
Signed-off-by: Eric Curtin <[email protected]>
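As an illustration of the kind of checks that commit message describes (the PR's actual rules and function names may differ), repository names and tags can be screened against a simplified form of the Docker distribution reference grammar before any registry URL is built from them.

```cpp
#include <regex>
#include <string>

// A repository name is one or more lowercase alphanumeric components,
// separated by '/', with '.', '_' or '-' allowed inside a component
// (simplified from the reference grammar), e.g. "ai/gemma3".
bool valid_docker_repo(const std::string & repo) {
    static const std::regex re("^[a-z0-9]+(?:[._-][a-z0-9]+)*(?:/[a-z0-9]+(?:[._-][a-z0-9]+)*)*$");
    return std::regex_match(repo, re);
}

// A tag is at most 128 characters: a letter, digit or '_' first,
// then letters, digits, '_', '.' or '-'.
bool valid_docker_tag(const std::string & tag) {
    static const std::regex re("^[A-Za-z0-9_][A-Za-z0-9_.-]{0,127}$");
    return std::regex_match(tag, re);
}
```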