@mudler mudler commented May 16, 2025

Description

The latest llama.cpp release brings various API changes along with some super exciting features (thanks 🫶 @ngxson and @ggerganov!). This called for a complete rewrite of our gRPC server to avoid drift with upstream.

The new implementation is now (almost) on par with what's on llama.cpp master, and keeping things in sync is now much easier.

Notes for Reviewers

In a next round, it would be cool to upstream some architectural changes (like splitting the main server from the HTTP server), reducing maintenance on LocalAI's side even further.

#5368

Supersedes #5365

Signed commits

  • Yes, I signed my commits.

netlify bot commented May 16, 2025

Deploy Preview for localai ready!

Name Link
🔨 Latest commit df3ec72
🔍 Latest deploy log https://app.netlify.com/projects/localai/deploys/68286306ed8e220008c39771
😎 Deploy Preview https://deploy-preview-5379--localai.netlify.app

Signed-off-by: Ettore Di Giacinto <[email protected]>
@mudler mudler mentioned this pull request May 16, 2025
@mudler mudler force-pushed the libmtmd-grpc-driftfree branch from 4488367 to dd381f8 Compare May 17, 2025 07:49
Signed-off-by: Ettore Di Giacinto <[email protected]>
@mudler mudler merged commit 6d5bde8 into master May 17, 2025
28 checks passed
@mudler mudler deleted the libmtmd-grpc-driftfree branch May 17, 2025 14:02
mudler added a commit to mudler/llama.cpp that referenced this pull request Jun 3, 2025
This is in order to improve maintainability and re-usability by
downstream projects such as LocalAI (see
mudler/LocalAI#5379 for context).

The context server is a struct that can be re-used quite heavily by
other communication protocols. For instance, LocalAI uses the context
server on top of gRPC rather than having a REST API. This change
improves overall re-usability by isolating the REST API to its own file
so the context server can be imported easily.

Signed-off-by: mudler <[email protected]>
@mudler mudler added enhancement New feature or request and removed dependencies labels Jun 15, 2025