ollama or llama cpp with a MacBook M1 Pro? #166835

Closed · Answered by Koarra
Koara asked this question in Models

Use Ollama if:

- You want quick setup, ease of use, and clean integration with tools like LangChain.
- You're focused on building prototypes or apps rather than tuning performance.
- You want automatic GPU (Metal) acceleration without fuss (a quick Python sketch follows below).
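For example, once Ollama is installed and a model has been pulled, integration can be a single HTTP call to its local REST API. This is a minimal sketch, assuming the default port (11434) and a model pulled with `ollama pull llama3`; the model name and prompt are only examples, so swap in whatever you actually run:

```python
import json
import urllib.request

# Minimal sketch: ask a locally running Ollama server for one completion.
# Assumes `ollama pull llama3` has been run and the server is on its
# default port, 11434.
OLLAMA_URL = "http://localhost:11434/api/generate"

payload = {
    "model": "llama3",   # any model you have pulled locally
    "prompt": "Explain Metal acceleration on Apple Silicon in one sentence.",
    "stream": False,     # return a single JSON object instead of a token stream
}

req = urllib.request.Request(
    OLLAMA_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(req) as resp:
    body = json.loads(resp.read())

print(body["response"])
```

Ollama handles the Metal offload and model lifecycle behind that endpoint, which is why it slots so easily into LangChain-style tooling.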

Use llama.cpp if:

- You want maximum control and performance tuning (e.g., custom quantization, batch sizes).
- You're okay with compiling from source and managing models manually.
- You don't need an API; you just want local CLI use or to embed inference in your own code (see the sketch below).
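If you want to embed llama.cpp in Python while still reaching its tuning knobs, the llama-cpp-python bindings expose most of them. The sketch below is illustrative only: the GGUF path is a placeholder, and the parameter values are just examples of the controls llama.cpp gives you (Metal layer offload, context size, batch size):

```python
from llama_cpp import Llama  # pip install llama-cpp-python

# Minimal sketch using the llama-cpp-python bindings for llama.cpp.
# The model path is a placeholder; point it at any GGUF file you have
# downloaded (e.g. a Q4_K_M quantization that fits in 16 GB of unified memory).
llm = Llama(
    model_path="./models/llama-3-8b-instruct.Q4_K_M.gguf",  # placeholder path
    n_gpu_layers=-1,  # offload all layers to Metal on an M1 Pro
    n_ctx=4096,       # context window; tune to your memory budget
    n_batch=512,      # prompt-processing batch size, one of the knobs you control
)

out = llm(
    "Q: What is quantization in the context of LLMs? A:",
    max_tokens=128,
    stop=["\n"],
)
print(out["choices"][0]["text"])
```

If you skip Python entirely, the compiled llama.cpp CLI exposes the same controls as command-line flags.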

Answer selected by Koara