Merged
29 commits
a3b17eb
Add example for GPULlama3ChatModel usage
mikepapadim Sep 2, 2025
7039560
Add example for GPULlama3StreamingChatModel usage
mikepapadim Sep 2, 2025
fd77fff
Add example project for GPULlama3 integration using Maven
mikepapadim Sep 2, 2025
c108b37
Add README and Maven module for GPULlama3 Java example
mikepapadim Sep 2, 2025
6bb222c
Enable GPU support in GPULlama3ChatModel example and comment out metr…
mikepapadim Sep 3, 2025
f1e743d
Expand README with instructions to run GPULlama3 example directly usi…
mikepapadim Sep 8, 2025
e16007f
Add comments specifying supported model formats and usage instruction…
mikepapadim Sep 8, 2025
92f1592
Update GPULlama3 example: integrate `exec-maven-plugin`, upgrade depe…
mikepapadim Sep 12, 2025
2bb0c53
Improve GPULlama3 examples: add prompt handling via arguments, enable…
mikepapadim Sep 12, 2025
65b58c6
Update GPULlama3 examples: add comments clarifying GPU usage option i…
mikepapadim Sep 12, 2025
53525a5
Expand README with instructions for running GPULlama3 example using T…
mikepapadim Sep 12, 2025
c694832
Expand README with instructions for running GPULlama3 example using T…
mikepapadim Sep 12, 2025
424752e
Use variable for langchain4j version in pom
orionpapadakis Sep 23, 2025
3c58a5b
Add a simple example with memory
orionpapadakis Sep 23, 2025
4a349c4
Add a more advanced example with conversation
orionpapadakis Sep 23, 2025
ef4221b
Merge branch 'main' of github.com:langchain4j/langchain4j-examples in…
mikepapadim Oct 7, 2025
b56a3f0
Update dependencies to langchain4j version 1.7.1-beta14
mikepapadim Oct 7, 2025
25b10a5
"Add environment variable support for specifying local model file pat…
orionpapadakis Oct 8, 2025
c1bb1e6
"Add setup instructions for configuring local LLMs in README"
orionpapadakis Oct 8, 2025
8ebea25
"Rename GPULlama3ChatModelExamples to GPULlama3ChatModelExample for c…
orionpapadakis Oct 8, 2025
affc866
"Rename GPULlama3StreamChatModelExamples to GPULlama3StreamChatModelE…
orionpapadakis Oct 8, 2025
2178d17
"Add validation for LOCAL_LLMS_PATH and update model file handling in…
orionpapadakis Oct 8, 2025
c65ed50
"Rename class and references from GPULlama3StreamChatModelExample to …
orionpapadakis Oct 8, 2025
674f755
"Add GPULlama3-based agent example (_1a_Basic_Agent_Example) with set…
orionpapadakis Oct 8, 2025
2c86883
"Set maxTokens to 1500 in GPULlama3ChatModelProvider"
orionpapadakis Oct 8, 2025
8013e1e
"Add GPULlama3-based structured CV generator example (_1b_Basic_Agent…
orionpapadakis Oct 8, 2025
5e1ca51
"Add GPULlama3-based sequential workflow agent example (_2a_Sequentia…
orionpapadakis Oct 8, 2025
0857e6e
Merge branch 'main' into examples/gpullama3.java
mikepapadim Oct 9, 2025
010cd43
Remove redundant dependency declaration for langchain4j-gpu-llama3 in…
mikepapadim Oct 14, 2025
169 changes: 169 additions & 0 deletions gpullama3.java-example/README.md
@@ -0,0 +1,169 @@
# Running GPULlama3 Example Directly with Java

This guide explains how to run the `GPULlama3ChatModelExample` program **without the Tornado launcher**: you invoke `java` directly, passing the TornadoVM flags yourself and using the Maven-built JAR together with its dependencies.

---

## **Step 0 — Configure Local LLMs**

To download GPULlama3.java-compatible LLMs, follow the instructions in the [GPULlama3.java README](https://github.com/beehive-lab/GPULlama3.java/blob/main/README.md).

For example, to get Llama 3.2 1B Instruct (FP16):
```bash
wget https://huggingface.co/beehive-lab/Llama-3.2-1B-Instruct-GGUF-FP16/resolve/main/beehive-llama-3.2-1b-instruct-fp16.gguf
```

Then export an environment variable pointing to the directory containing the downloaded models:

```bash
export LOCAL_LLMS_PATH=/path/to/downloaded/local/llms
```

This environment variable will be used by the example applications.
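
To verify the setup, check that the downloaded model file is visible under that directory (a quick sanity check, assuming the file name from the example above):

```bash
ls -lh "$LOCAL_LLMS_PATH/beehive-llama-3.2-1b-instruct-fp16.gguf"
```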

## **Step 1 — Get the TornadoVM JVM flags**

Run the following command (requires a TornadoVM installation):

```bash
tornado --printJavaFlags
```

Example output:

```bash
/home/mikepapadim/.sdkman/candidates/java/current/bin/java -server \
-XX:-UseCompressedOops -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI \
-XX:-UseCompressedClassPointers --enable-preview \
-Djava.library.path=/home/mikepapadim/java-ai-demos/GPULlama3.java/external/tornadovm/bin/sdk/lib \
--module-path .:/home/mikepapadim/java-ai-demos/GPULlama3.java/external/tornadovm/bin/sdk/share/java/tornado \
-Dtornado.load.api.implementation=uk.ac.manchester.tornado.runtime.tasks.TornadoTaskGraph \
-Dtornado.load.runtime.implementation=uk.ac.manchester.tornado.runtime.TornadoCoreRuntime \
-Dtornado.load.tornado.implementation=uk.ac.manchester.tornado.runtime.common.Tornado \
-Dtornado.load.annotation.implementation=uk.ac.manchester.tornado.annotation.ASMClassVisitor \
-Dtornado.load.annotation.parallel=uk.ac.manchester.tornado.api.annotations.Parallel \
--upgrade-module-path /home/mikepapadim/java-ai-demos/GPULlama3.java/external/tornadovm/bin/sdk/share/java/graalJars \
-XX:+UseParallelGC \
@/home/mikepapadim/java-ai-demos/GPULlama3.java/external/tornadovm/bin/sdk/etc/exportLists/common-exports \
@/home/mikepapadim/java-ai-demos/GPULlama3.java/external/tornadovm/bin/sdk/etc/exportLists/opencl-exports \
--add-modules ALL-SYSTEM,tornado.runtime,tornado.annotation,tornado.drivers.common,tornado.drivers.opencl
```
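
The examples below reuse these flags verbatim, so it can be convenient to capture them once in shell variables. A minimal sketch, assuming the output format shown above (the first token is the `java` binary, the rest are JVM flags, and the binary path contains no spaces):

```bash
TORNADO_FLAGS="$(tornado --printJavaFlags)"
JAVA_BIN="${TORNADO_FLAGS%% *}"      # leading path to the java binary
TORNADO_FLAGS="${TORNADO_FLAGS#* }"  # remaining JVM flags
```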

## **Step 2 — Build the Maven classpath**

First, build `agentic-tutorial`, which is a dependency of `gpullama3.java-example`:

```bash
cd langchain4j-examples/agentic-tutorial
mvn clean install
```

Then, from the `gpullama3.java-example` directory, generate the dependency classpath:

```bash
mvn dependency:build-classpath -Dmdep.outputFile=cp.txt
```
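
This writes the colon-separated dependency classpath to `cp.txt`, which later steps combine with the example JAR. A quick check that it was produced:

```bash
test -s cp.txt && echo "dependency classpath written to cp.txt"
```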

## **Step 3 — Build the example JAR**

```bash
mvn clean package
```

Your main JAR will be located at:
```bash
target/gpullama3.java-example-1.7.1-beta14.jar
```

## **Step 4 — Run the program directly with Java**

You can now run the example with the full set of JVM and TornadoVM flags:

```bash
JAVA_BIN=/home/mikepapadim/.sdkman/candidates/java/current/bin/java
CP="target/gpullama3.java-example-1.4.0-beta10.jar:$(cat cp.txt)"

$JAVA_BIN \
-server \
-XX:-UseCompressedOops \
-XX:+UnlockExperimentalVMOptions \
-XX:+EnableJVMCI \
--enable-preview \
-Djava.library.path=/home/mikepapadim/java-ai-demos/GPULlama3.java/external/tornadovm/bin/sdk/lib \
--module-path .:/home/mikepapadim/java-ai-demos/GPULlama3.java/external/tornadovm/bin/sdk/share/java/tornado \
-Dtornado.load.api.implementation=uk.ac.manchester.tornado.runtime.tasks.TornadoTaskGraph \
-Dtornado.load.runtime.implementation=uk.ac.manchester.tornado.runtime.TornadoCoreRuntime \
-Dtornado.load.tornado.implementation=uk.ac.manchester.tornado.runtime.common.Tornado \
-Dtornado.load.annotation.implementation=uk.ac.manchester.tornado.annotation.ASMClassVisitor \
-Dtornado.load.annotation.parallel=uk.ac.manchester.tornado.api.annotations.Parallel \
--upgrade-module-path /home/mikepapadim/java-ai-demos/GPULlama3.java/external/tornadovm/bin/sdk/share/java/graalJars \
-XX:+UseParallelGC \
@/home/mikepapadim/java-ai-demos/GPULlama3.java/external/tornadovm/bin/sdk/etc/exportLists/common-exports \
@/home/mikepapadim/java-ai-demos/GPULlama3.java/external/tornadovm/bin/sdk/etc/exportLists/opencl-exports \
--add-modules ALL-SYSTEM,tornado.runtime,tornado.annotation,tornado.drivers.common,tornado.drivers.opencl \
-Duse.tornadovm=true \
-Xms6g -Xmx6g \
-Dtornado.device.memory=6GB \
-cp "$CP" \
GPULlama3ChatModelExample

```

### Optional: Create a shell script

You can save the above command as `run-direct.sh` and run it with:
```bash
bash run-direct.sh
```
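
Alternatively, here is a compact `run-direct.sh` sketch that rebuilds the same command from the flags captured in Step 1, instead of hard-coding machine-specific paths. The heap and device-memory sizes are the same assumptions as above:

```bash
#!/usr/bin/env bash
set -euo pipefail

# Reuse the flags printed by `tornado --printJavaFlags` (see Step 1).
TORNADO_FLAGS="$(tornado --printJavaFlags)"
JAVA_BIN="${TORNADO_FLAGS%% *}"      # leading path to the java binary
TORNADO_FLAGS="${TORNADO_FLAGS#* }"  # remaining JVM flags

CP="target/gpullama3.java-example-1.7.1-beta14.jar:$(cat cp.txt)"

# TORNADO_FLAGS is intentionally unquoted so it expands into separate arguments.
"$JAVA_BIN" $TORNADO_FLAGS \
    -Duse.tornadovm=true \
    -Xms6g -Xmx6g \
    -Dtornado.device.memory=6GB \
    -cp "$CP" \
    GPULlama3ChatModelExample "$@"
```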


### Optional: Run the streaming example (`GPULlama3StreamingChatModelExample`) with an explicit classpath
```bash
/home/mikepapadim/.sdkman/candidates/java/current/bin/java \
-server \
-XX:-UseCompressedOops \
-XX:+UnlockExperimentalVMOptions \
-XX:+EnableJVMCI \
--enable-preview \
-Djava.library.path=/home/mikepapadim/java-ai-demos/GPULlama3.java/external/tornadovm/bin/sdk/lib \
--module-path .:/home/mikepapadim/java-ai-demos/GPULlama3.java/external/tornadovm/bin/sdk/share/java/tornado \
-Dtornado.load.api.implementation=uk.ac.manchester.tornado.runtime.tasks.TornadoTaskGraph \
-Dtornado.load.runtime.implementation=uk.ac.manchester.tornado.runtime.TornadoCoreRuntime \
-Dtornado.load.tornado.implementation=uk.ac.manchester.tornado.runtime.common.Tornado \
-Dtornado.load.annotation.implementation=uk.ac.manchester.tornado.annotation.ASMClassVisitor \
-Dtornado.load.annotation.parallel=uk.ac.manchester.tornado.api.annotations.Parallel \
--upgrade-module-path /home/mikepapadim/java-ai-demos/GPULlama3.java/external/tornadovm/bin/sdk/share/java/graalJars \
-XX:+UseParallelGC \
@/home/mikepapadim/java-ai-demos/GPULlama3.java/external/tornadovm/bin/sdk/etc/exportLists/common-exports \
@/home/mikepapadim/java-ai-demos/GPULlama3.java/external/tornadovm/bin/sdk/etc/exportLists/opencl-exports \
--add-modules ALL-SYSTEM,tornado.runtime,tornado.annotation,tornado.drivers.common,tornado.drivers.opencl \
-Xms6g \
-Xmx6g \
-Dtornado.device.memory=6GB \
-cp "target/gpullama3.java-example-1.4.0-beta10.jar:/home/mikepapadim/.m2/repository/dev/langchain4j/langchain4j-core/1.5.0-SNAPSHOT/langchain4j-core-1.5.0-SNAPSHOT.jar:/home/mikepapadim/.m2/repository/com/fasterxml/jackson/core/jackson-annotations/2.19.2/jackson-annotations-2.19.2.jar:/home/mikepapadim/.m2/repository/com/fasterxml/jackson/core/jackson-core/2.19.2/jackson-core-2.19.2.jar:/home/mikepapadim/.m2/repository/com/fasterxml/jackson/core/jackson-databind/2.19.2/jackson-databind-2.19.2.jar:/home/mikepapadim/.m2/repository/org/slf4j/slf4j-api/2.0.17/slf4j-api-2.0.17.jar:/home/mikepapadim/.m2/repository/org/jspecify/jspecify/1.0.0/jspecify-1.0.0.jar:/home/mikepapadim/.m2/repository/dev/langchain4j/langchain4j-gpu-llama3/1.5.0-SNAPSHOT/langchain4j-gpu-llama3-1.5.0-SNAPSHOT.jar:/home/mikepapadim/.m2/repository/org/beehive/gpullama3/gpu-llama3/2.0-SNAPSHOT/gpu-llama3-2.0-SNAPSHOT.jar" \
GPULlama3StreamingChatModelExample

```

### Run the agentic examples

###### Note: Make sure the `agentic-tutorial` project has been built first (see Step 2).

1) Run GPULlama3_1a_Basic_Agent_Example on GPU:

```bash
tornado -cp target/gpullama3.java-example-1.7.1-beta14.jar:$(cat cp.txt) \
agentic._1_basic_agent.GPULlama3_1a_Basic_Agent_Example GPU
```

2) Run GPULlama3_1b_Basic_Agent_Example_Structured on GPU:

```bash
tornado -cp target/gpullama3.java-example-1.7.1-beta14.jar:$(cat cp.txt) \
agentic._1_basic_agent.GPULlama3_1b_Basic_Agent_Example_Structured GPU
```

3) Run GPULlama3_2a_Sequential_Agent_Example on GPU:

```bash
tornado -cp target/gpullama3.java-example-1.7.1-beta14.jar:$(cat cp.txt) \
agentic._2_sequential_workflow.GPULlama3_2a_Sequential_Agent_Example GPU
```
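
To run all three agentic examples back to back, a small loop over the main classes works as well (a sketch, assuming `cp.txt` from Step 2 and the JAR name above):

```bash
CP="target/gpullama3.java-example-1.7.1-beta14.jar:$(cat cp.txt)"
for MAIN in \
    agentic._1_basic_agent.GPULlama3_1a_Basic_Agent_Example \
    agentic._1_basic_agent.GPULlama3_1b_Basic_Agent_Example_Structured \
    agentic._2_sequential_workflow.GPULlama3_2a_Sequential_Agent_Example
do
    tornado -cp "$CP" "$MAIN" GPU
done
```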
56 changes: 56 additions & 0 deletions gpullama3.java-example/pom.xml
@@ -0,0 +1,56 @@
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>

<parent>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-examples</artifactId>
<version>1.7.1-beta14</version>
</parent>

<artifactId>gpullama3.java-example</artifactId>

<properties>
<maven.compiler.source>21</maven.compiler.source>
<maven.compiler.target>21</maven.compiler.target>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<langchain4j.version>1.7.1</langchain4j.version>
</properties>

<build>
<plugins>
<plugin>
<groupId>org.codehaus.mojo</groupId>
<artifactId>exec-maven-plugin</artifactId>
<version>3.3.0</version>
<configuration>
<executable>${java.home}/bin/java</executable>
<mainClass>GPULlama3ChatModelExample</mainClass>
</configuration>
</plugin>
</plugins>
</build>

<dependencies>
<!-- GPU-accelerated Llama 3 -->
<dependency>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-gpu-llama3</artifactId>
<version>1.7.1-beta14</version>
</dependency>

<dependency>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-agentic</artifactId>
<version>1.7.1-beta14</version>
</dependency>

<dependency>
<groupId>dev.langchain4j</groupId>
<artifactId>agentic-tutorial</artifactId>
<version>1.7.1</version> <!-- Match the version in agentic-tutorial's POM -->
</dependency>
</dependencies>
</project>
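
With the `exec-maven-plugin` configured above, the chat example can also be launched through Maven. This is only a sketch under stated assumptions: `exec:java` picks up `<mainClass>` from the plugin configuration, and the TornadoVM flags from Step 1 of the README are assumed to reach the JVM via `MAVEN_OPTS`; without them, GPU execution will not initialize:

```bash
# Assumption: MAVEN_OPTS is honored by the JVM that runs the example.
FLAGS="$(tornado --printJavaFlags)"
export MAVEN_OPTS="${FLAGS#* }"   # drop the leading java binary path
mvn exec:java -Dexec.args="Hello"
```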
63 changes: 63 additions & 0 deletions GPULlama3ChatModelExample.java
@@ -0,0 +1,63 @@
import dev.langchain4j.data.message.SystemMessage;
import dev.langchain4j.data.message.UserMessage;
import dev.langchain4j.model.chat.request.ChatRequest;
import dev.langchain4j.model.chat.response.ChatResponse;
import dev.langchain4j.model.gpullama3.GPULlama3ChatModel;

import java.nio.file.Path;

public class GPULlama3ChatModelExample {

    public static void main(String[] args) {

        // Read the path to your *local* model files.
        String localLLMsPath = System.getenv("LOCAL_LLMS_PATH");

        // Check that the environment variable is set.
        if (localLLMsPath == null || localLLMsPath.isEmpty()) {
            System.err.println("Error: LOCAL_LLMS_PATH environment variable is not set.");
            System.err.println("Please set this environment variable to the directory containing your local model files.");
            System.exit(1);
        }

        // Change this model file name to choose any of your *local* model files.
        // Supports Mistral, Llama3, Phi-3, Qwen2.5 and Qwen3 in GGUF format.
        String modelFile = "beehive-llama-3.2-1b-instruct-fp16.gguf";
        Path modelPath = Path.of(localLLMsPath, modelFile);

        // Use the first program argument as the prompt, or fall back to a default.
        String prompt;
        if (args.length > 0) {
            prompt = args[0];
            System.out.println("User Prompt: " + prompt);
        } else {
            prompt = "What is the capital of France?";
            System.out.println("Example Prompt: " + prompt);
        }

        System.out.println("Path: " + modelPath);

        // @formatter:off
        ChatRequest request = ChatRequest.builder()
                .messages(
                        UserMessage.from(prompt),
                        SystemMessage.from("reply with extensive sarcasm"))
                .build();

        GPULlama3ChatModel model = GPULlama3ChatModel.builder()
                .modelPath(modelPath)
                .onGPU(Boolean.TRUE) // if false, runs on CPU through a lightweight implementation of llama3.java
                .build();
        // @formatter:on

        ChatResponse response = model.chat(request);
        System.out.println("\n" + response.aiMessage().text());

        // Optionally print performance metrics of the last run.
        model.printLastMetrics();
    }
}
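
Besides the direct-`java` route in the README, the packaged example can be launched with the TornadoVM launcher, mirroring the agentic examples; a sketch assuming `cp.txt` from Step 2. The first program argument overrides the default prompt:

```bash
tornado -cp "target/gpullama3.java-example-1.7.1-beta14.jar:$(cat cp.txt)" \
    GPULlama3ChatModelExample "Explain GPU offloading in one sentence."
```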
74 changes: 74 additions & 0 deletions GPULlama3StreamingChatModelExample.java
@@ -0,0 +1,74 @@
import dev.langchain4j.data.message.SystemMessage;
import dev.langchain4j.data.message.UserMessage;
import dev.langchain4j.model.chat.request.ChatRequest;
import dev.langchain4j.model.chat.response.ChatResponse;
import dev.langchain4j.model.chat.response.StreamingChatResponseHandler;
import dev.langchain4j.model.gpullama3.GPULlama3StreamingChatModel;

import java.nio.file.Path;
import java.util.concurrent.CompletableFuture;

public class GPULlama3StreamingChatModelExample {

    public static void main(String[] args) {

        // Read the path to your *local* model files.
        String localLLMsPath = System.getenv("LOCAL_LLMS_PATH");

        // Check that the environment variable is set.
        if (localLLMsPath == null || localLLMsPath.isEmpty()) {
            System.err.println("Error: LOCAL_LLMS_PATH environment variable is not set.");
            System.err.println("Please set this environment variable to the directory containing your local model files.");
            System.exit(1);
        }

        // Change this model file name to choose any of your *local* model files.
        // Supports Mistral, Llama3, Phi-3, Qwen2.5 and Qwen3 in GGUF format.
        String modelFile = "beehive-llama-3.2-1b-instruct-fp16.gguf";
        Path modelPath = Path.of(localLLMsPath, modelFile);

        // Use the first program argument as the prompt, or fall back to a default.
        String prompt;
        if (args.length > 0) {
            prompt = args[0];
            System.out.println("User Prompt: " + prompt);
        } else {
            prompt = "What is the capital of France?";
            System.out.println("Example Prompt: " + prompt);
        }

        // @formatter:off
        ChatRequest request = ChatRequest.builder()
                .messages(
                        UserMessage.from(prompt),
                        SystemMessage.from("reply with extensive sarcasm"))
                .build();

        GPULlama3StreamingChatModel model = GPULlama3StreamingChatModel.builder()
                .onGPU(Boolean.TRUE) // if false, runs on CPU through a lightweight implementation of llama3.java
                .modelPath(modelPath)
                .build();
        // @formatter:on

        CompletableFuture<ChatResponse> futureResponse = new CompletableFuture<>();

        model.chat(request, new StreamingChatResponseHandler() {

            @Override
            public void onPartialResponse(String partialResponse) {
                // Print each fragment as soon as it arrives.
                System.out.print(partialResponse);
            }

            @Override
            public void onCompleteResponse(ChatResponse completeResponse) {
                futureResponse.complete(completeResponse);
                model.printLastMetrics();
            }

            @Override
            public void onError(Throwable error) {
                futureResponse.completeExceptionally(error);
            }
        });

        // Block until the complete response (or an error) arrives.
        futureResponse.join();
    }
}