Skip to content

Conversation

@ericcurtin
Copy link
Collaborator

@ericcurtin ericcurtin commented Sep 4, 2025

To pull and run models via: llama-server -dr gemma3
Add some validators and sanitizers for Docker Model urls and metadata

@ericcurtin ericcurtin force-pushed the docker-pull-functionality branch 3 times, most recently from 789373c to f16b5b6 Compare September 4, 2025 11:43
@ericcurtin
Copy link
Collaborator Author

@CISC @ggerganov PTAL

@ericcurtin ericcurtin force-pushed the docker-pull-functionality branch 2 times, most recently from 79f41b0 to d10c249 Compare September 4, 2025 12:44
@ericcurtin ericcurtin requested a review from Copilot September 4, 2025 12:52
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds Docker registry support to llama-server, enabling users to pull and run AI models directly from Docker Hub using the docker:// protocol. The implementation handles Docker registry authentication, manifest parsing, and blob downloading to cache models locally.

  • Adds Docker URL parsing and resolution functionality to download GGUF models from Docker registries
  • Integrates Docker model resolution into the existing model loading pipeline
  • Implements streaming download with proper authentication and caching support

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.

File Description
common/common.cpp Integrates Docker model resolution into the model loading pipeline and updates error messages
common/arg.h Adds function declaration for Docker model resolution
common/arg.cpp Implements complete Docker registry functionality including authentication, manifest parsing, and blob downloading

@ericcurtin ericcurtin force-pushed the docker-pull-functionality branch 3 times, most recently from 2333af1 to ab246cb Compare September 4, 2025 13:50
@ericcurtin
Copy link
Collaborator Author

@JohannesGaessler @slaren PTAL

@ericcurtin
Copy link
Collaborator Author

@danbev PTAL

@ericcurtin
Copy link
Collaborator Author

@ggerganov @JohannesGaessler @slaren @danbev struggle to get this reviewed, if you guys have cycles I'd appreciate it.

@ericcurtin
Copy link
Collaborator Author

ericcurtin commented Sep 9, 2025

I'm also having a change of heart, thinking of changing to a:

-d/--docker-repo

option, at least the string would be the same then as suggested in Docker Hub and the one used in Docker Model Runner. It would also be more consistent with the huggingface argument approach.

@ericcurtin ericcurtin force-pushed the docker-pull-functionality branch 2 times, most recently from f60d560 to d36c7aa Compare September 9, 2025 14:14
@ericcurtin
Copy link
Collaborator Author

@ggerganov @danbev ready for re-review

@ericcurtin ericcurtin force-pushed the docker-pull-functionality branch from d36c7aa to c01b8e5 Compare September 9, 2025 14:18
@ericcurtin ericcurtin changed the title Add docker:// protocol support for llama-server model pulling Add docker protocol support for llama-server model loading Sep 9, 2025
@ericcurtin ericcurtin force-pushed the docker-pull-functionality branch 6 times, most recently from c0e6f0b to 85ad516 Compare September 9, 2025 17:30
@ericcurtin ericcurtin force-pushed the docker-pull-functionality branch from efb7394 to fdc9c55 Compare September 9, 2025 22:22
@ericcurtin
Copy link
Collaborator Author

Added resumable downloads in a second commit, models can be large and redownloading models from scratch on interrupted connections can be a pain and a waste of bandwidth

@ericcurtin ericcurtin force-pushed the docker-pull-functionality branch 5 times, most recently from 603a693 to e32f32f Compare September 10, 2025 09:27
@ericcurtin ericcurtin force-pushed the docker-pull-functionality branch 2 times, most recently from 3ba6076 to 9b15dfc Compare September 10, 2025 09:56
@ericcurtin
Copy link
Collaborator Author

@ggerganov ready for re-review

@ericcurtin ericcurtin force-pushed the docker-pull-functionality branch 3 times, most recently from 0bd8384 to 7e1f35c Compare September 10, 2025 12:02
Copy link
Member

@ggerganov ggerganov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docker-related functions seem ok.

I'm not confident about the changes to common_download_file_single to support resumable downloads. Either wait for someone to review this part in details, or move it to a separate PR.

@ericcurtin
Copy link
Collaborator Author

ericcurtin commented Sep 11, 2025

The docker-related functions seem ok.

I'm not confident about the changes to common_download_file_single to support resumable downloads. Either wait for someone to review this part in details, or move it to a separate PR.

SGTM, I do think that resumable downloads bit is important in the next PR, whether it's huggingface, docker, etc. Somebody has to pay the cloud bill of all the wasted petabytes that are retransferred because of retries.

And of course having to start a download from the start again because of an interrupted connection simply being annoying.

Sometimes that can make larger models impossible to download for people.

Even some servers have server side timeouts if you down finish the download in a certain time.

Resumable downloads solves these things, client side.

@ericcurtin ericcurtin force-pushed the docker-pull-functionality branch from 7e1f35c to 9278552 Compare September 11, 2025 12:39
@ericcurtin
Copy link
Collaborator Author

@ggerganov all done, it's just the docker pulling change now

@ericcurtin
Copy link
Collaborator Author

ericcurtin commented Sep 11, 2025

Getting build problems unrelated to this PR:

/Users/runner/work/llama.cpp/llama.cpp/ggml/src/ggml-blas/ggml-blas.cpp:143:13: error: 'cblas_sgemm' is only available on macOS 13.3 or newer [-Werror,-Wunguarded-availability-new]
            cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasTrans,
            ^~~~~~~~~~~

x86_64 macOS

Gonna try a rebuild.

@ericcurtin ericcurtin force-pushed the docker-pull-functionality branch 2 times, most recently from 0327dd0 to c728d2e Compare September 11, 2025 13:39
@ericcurtin ericcurtin force-pushed the docker-pull-functionality branch from c728d2e to 93e0e58 Compare September 11, 2025 14:15
@ericcurtin
Copy link
Collaborator Author

@ggerganov green build!

To pull and run models via: llama-server -dr gemma3
Add some validators and sanitizers for Docker Model urls and metadata

Signed-off-by: Eric Curtin <[email protected]>
@ericcurtin ericcurtin force-pushed the docker-pull-functionality branch from 93e0e58 to 802e1fb Compare September 12, 2025 15:31
@ericcurtin ericcurtin merged commit 4bf5549 into master Sep 12, 2025
45 of 48 checks passed
@ericcurtin ericcurtin deleted the docker-pull-functionality branch September 12, 2025 15:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants