A minimal Java + HTML + JS chat app that shows how to use Docker Model Runner as a local LLM backend — fully integrated with LangChain4j.
Runs locally, uses your GPU (if available), and doesn’t depend on any cloud APIs.
- 🧠 LangChain4j + local LLMs via Docker
- 🐳 Uses Docker Model Runner (lets you pull gemma3, phi4, DeepSeek, ...)
- ⚡ GPU acceleration (if available)
- 🔂 Chat memory retains the 15 most recent messages (illustrated further down)
- 🍃 Pure Java + vanilla JS + basic HTML
- 🧰 Minimal dependencies: Jackson + LangChain4j only
- Java 17+
- Docker Desktop 4.40+ (macOS / coming to Windows soon)
- Maven or Gradle
- Enable Model Runner feature
Go to Docker Desktop settings and enable the following:
- ✅ Settings → Features in Development → Enable Model Runner
- ✅ Enable host-side TCP support (default port: 12434)
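Optionally, you can sanity-check from the host that the endpoint is reachable. OpenAI-compatible servers usually list their models under /v1/models; the exact path below is an assumption based on that convention, not something documented here:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class EndpointCheck {
    public static void main(String[] args) throws Exception {
        // Docker Model Runner's host-side endpoint; /v1/models follows the OpenAI API convention
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:12434/engines/llama.cpp/v1/models"))
                .GET()
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());

        System.out.println(response.statusCode()); // expect 200 once the feature is enabled
        System.out.println(response.body());       // JSON describing locally pulled models
    }
}
```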
Pull a model:

```
docker model pull ai/gemma3
```

or any other model from Docker Hub AI.
Once a model is pulled, it can be run at any time. Verify with:
```
docker model list
```

Build and run the app with Maven:

```
mvn clean compile exec:java
```

or with Gradle:

```
./gradlew run
```

Now open http://localhost:8080 and start chatting.
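If you're wondering what is listening on that port: there is no web framework involved, just plain Java. Here's a rough sketch of that idea using the JDK's built-in HttpServer; the real ChatServer.java is organized differently and also handles the chat requests, so treat this as an illustration only:

```java
import com.sun.net.httpserver.HttpServer;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.file.Files;
import java.nio.file.Path;

public class StaticPageSketch {
    public static void main(String[] args) throws Exception {
        // Plain JDK HTTP server, no framework involved
        HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);

        // Serve the chat page on "/" (the file path here is illustrative)
        server.createContext("/", exchange -> {
            byte[] page = Files.readAllBytes(Path.of("index.html"));
            exchange.getResponseHeaders().add("Content-Type", "text/html; charset=utf-8");
            exchange.sendResponseHeaders(200, page.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(page);
            }
        });

        server.start();
        System.out.println("Listening on http://localhost:8080");
    }
}
```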
If you're curious where the connection to the local model is made, it's in ChatServer.java:
```java
OpenAiChatModel model = OpenAiChatModel.builder()
        .apiKey("not needed")
        .baseUrl("http://localhost:12434/engines/llama.cpp/v1")
        .modelName("ai/gemma3") // or any pulled model
        .build();
```

It uses Docker Model Runner's OpenAI-compatible endpoint (on port 12434) to route requests to your local model.
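Once that model object exists, a round trip to the local model is only a few lines. The sketch below also wires in the 15-message window from the feature list; the method names follow recent LangChain4j releases (older 0.x versions use generate(...) instead of chat(...)), and the wiring is illustrative rather than copied from ChatServer.java:

```java
import dev.langchain4j.data.message.AiMessage;
import dev.langchain4j.data.message.UserMessage;
import dev.langchain4j.memory.ChatMemory;
import dev.langchain4j.memory.chat.MessageWindowChatMemory;
import dev.langchain4j.model.openai.OpenAiChatModel;

public class ChatSketch {
    public static void main(String[] args) {
        OpenAiChatModel model = OpenAiChatModel.builder()
                .apiKey("not needed")
                .baseUrl("http://localhost:12434/engines/llama.cpp/v1")
                .modelName("ai/gemma3")
                .build();

        // The 15-message window from the feature list: older messages are evicted automatically
        ChatMemory memory = MessageWindowChatMemory.withMaxMessages(15);
        memory.add(UserMessage.from("Explain Docker Model Runner in one sentence."));

        // Send the whole window to the model and remember its reply
        AiMessage reply = model.chat(memory.messages()).aiMessage();
        memory.add(reply);

        System.out.println(reply.text());
    }
}
```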
To use a different model:

```
docker model pull ai/llama3.2:latest
```

Then update the code:

```java
.modelName("ai/llama3.2:latest")
```

And run the app again.
Install the DevoxxGenie plugin in IntelliJ IDEA via the marketplace.


