From 43985eba9cc9c0fca021fc7f855824afe0f34dda Mon Sep 17 00:00:00 2001
From: Michele Rastelli
Date: Tue, 5 Aug 2025 22:30:06 +0200
Subject: [PATCH 1/8] ArangoDB Tinkerpop Provider: initial documentation

---
 .../3.12/develop/integrations/_index.md       |   9 +
 .../arangodb-tinkerpop-provider.md            | 657 ++++++++++++++++++
 2 files changed, 666 insertions(+)
 create mode 100644 site/content/3.12/develop/integrations/arangodb-tinkerpop-provider.md

diff --git a/site/content/3.12/develop/integrations/_index.md b/site/content/3.12/develop/integrations/_index.md
index 635983cc4d..6d1325fbb5 100644
--- a/site/content/3.12/develop/integrations/_index.md
+++ b/site/content/3.12/develop/integrations/_index.md
@@ -47,3 +47,12 @@ allows you to export data from Apache Kafka to ArangoDB.
 - Repository: [github.com/arangodb/kafka-connect-arangodb/](https://github.com/arangodb/kafka-connect-arangodb/)
 - [Demo](https://github.com/arangodb/kafka-connect-arangodb/tree/main/demo)
 - [Changelog](https://github.com/arangodb/kafka-connect-arangodb/blob/main/ChangeLog.md)
+
+## TinkerPop Provider
+
+The [**ArangoDB TinkerPop Provider**](arangodb-tinkerpop-provider.md) is an implementation of
+the [Apache TinkerPop OLTP Provider](https://tinkerpop.apache.org/docs/3.7.3/dev/provider) API
+for ArangoDB.
+
+- Repository: [github.com/arangodb/arangodb-tinkerpop-provider](https://github.com/arangodb/arangodb-tinkerpop-provider)
+- [Changelog](https://github.com/arangodb/arangodb-tinkerpop-provider/blob/main/CHANGELOG.md)
diff --git a/site/content/3.12/develop/integrations/arangodb-tinkerpop-provider.md b/site/content/3.12/develop/integrations/arangodb-tinkerpop-provider.md
new file mode 100644
index 0000000000..5b4546f1da
--- /dev/null
+++ b/site/content/3.12/develop/integrations/arangodb-tinkerpop-provider.md
@@ -0,0 +1,657 @@
+---
+title: ArangoDB TinkerPop Provider
+menuTitle: TinkerPop Provider
+weight: 10
+description: >-
+  The ArangoDB TinkerPop Provider allows using the standard TinkerPop API with ArangoDB as the storage backend
+---
+ArangoDB TinkerPop Provider is an implementation of the [Apache TinkerPop OLTP Provider](https://tinkerpop.apache.org/docs/3.7.3/dev/provider) API for
+ArangoDB.
+
+It allows using the standard TinkerPop API with ArangoDB as the backend storage. It supports creating,
+querying, and manipulating graph data using the Gremlin traversal language, while offering the possibility to use native
+AQL (ArangoDB Query Language) for complex queries.
+
+- Repository: [github.com/arangodb/arangodb-tinkerpop-provider](https://github.com/arangodb/arangodb-tinkerpop-provider)
+- [Code examples](https://github.com/arangodb/arangodb-tinkerpop-provider/tree/main/src/test/java/example)
+- [Demo](https://github.com/arangodb/arangodb-tinkerpop-provider/tree/main/demo)
+- [JavaDoc](https://www.javadoc.io/doc/com.arangodb/arangodb-tinkerpop-provider/latest/index.html) (generated reference documentation)
+- [ChangeLog](https://github.com/arangodb/arangodb-tinkerpop-provider/blob/main/CHANGELOG.md)
+
+## Compatibility
+
+This Provider is compatible with:
+
+* Apache TinkerPop 3.7
+* ArangoDB 3.12+
+* ArangoDB Java Driver 7.22+
+* Java 8+
+
+## Installation
+
+### Maven
+
+To add the provider to your project via Maven, include the following dependency (check
+the [latest version here](https://search.maven.org/artifact/com.arangodb/arangodb-tinkerpop-provider)):
+
+```xml
+<dependencies>
+    <dependency>
+        <groupId>com.arangodb</groupId>
+        <artifactId>arangodb-tinkerpop-provider</artifactId>
+        <version>x.y.z</version>
+    </dependency>
+</dependencies>
+```
+
+### Gradle
+
+For Gradle projects, add:
+
+```groovy
+implementation 'com.arangodb:arangodb-tinkerpop-provider:x.y.z'
+```
+
+### Gremlin Console
+
+TODO (DE-1062)
+
+### Server Plugin
+
+TODO (DE-1061)
+
+## Quick Start
+
+Here's a simple example to get you started:
+
+[//]: <> (@formatter:off)
+```java
+// Create a configuration
+Configuration conf = new ArangoDBConfigurationBuilder()
+        .hosts("localhost:8529")
+        .user("root")
+        .password("test")
+        .db("myDatabase")
+        .name("myGraph")
+        .enableDataDefinition(true) // Allow creating database and graph if they don't exist
+        .build();
+
+// Create the graph
+ArangoDBGraph graph = (ArangoDBGraph) GraphFactory.open(conf);
+
+// Get a traversal source
+GraphTraversalSource g = graph.traversal();
+
+// Add some data
+Vertex person = g.addV("person")
+        .property("name", "Alice")
+        .property("age", 30)
+        .property("country", "Germany")
+        .next();
+
+Vertex software = g.addV("software")
+        .property("name", "JArango")
+        .property("lang", "Java")
+        .next();
+
+Edge created = g.addE("created")
+        .from(person)
+        .to(software)
+        .property("year", 2025)
+        .next();
+
+// Query the graph: find the creators of JArango
+List<Object> creators = g.V()
+        .hasLabel("software")
+        .has("name", "JArango")
+        .in("created")
+        .values("name")
+        .toList();
+
+System.out.println("Creators: " + creators);
+
+// Find all software created by Alice
+List<Object> aliceSoftware = g.V()
+        .hasLabel("person")
+        .has("name", "Alice")
+        .out("created")
+        .values("name")
+        .toList();
+
+System.out.println("aliceSoftware: " + aliceSoftware);
+
+// Update a property
+g.V()
+        .hasLabel("person")
+        .has("name", "Alice")
+        .property("age", 31)
+        .iterate();
+
+// Remove a property
+g.V()
+        .hasLabel("person")
+        .has("name", "Alice")
+        .properties("country")
+        .drop()
+        .iterate();
+
+// Read back the remaining properties of Alice
+Map<Object, Object> alice = g.V()
+        .hasLabel("person")
+        .has("name", "Alice")
+        .valueMap()
+        .next();
+
+System.out.println("alice: " + alice);
+
+// Remove an edge
+g.E()
+        .hasLabel("created")
+        .where(__.outV()
+                .has("name", "Alice"))
+        .where(__.inV()
+                .has("name", "JArango"))
+        .drop()
+        .iterate();
+
+// Remove a vertex (and its incident edges)
+g.V()
+        .hasLabel("person")
+        .has("name", "Alice")
+        .drop()
+        .iterate();
+
+// Close the graph when done
+graph.close();
+```
+[//]: <> (@formatter:on)
+
+## Configuration
+
+The graph can be created using the `org.apache.tinkerpop.gremlin.structure.util.GraphFactory.open(...)` methods
+(see the [Javadoc](https://tinkerpop.apache.org/javadocs/3.7.3/full/org/apache/tinkerpop/gremlin/structure/util/GraphFactory.html)).
+These methods accept a configuration file (e.g., a YAML or properties file), a Java `Map`, or an Apache Commons
+Configuration object.
+
+The property `gremlin.graph` must be set to: `com.arangodb.tinkerpop.gremlin.structure.ArangoDBGraph`.
+
+Configuration examples can be found in the [code examples](https://github.com/arangodb/arangodb-tinkerpop-provider/tree/main/src/test/java/example).
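As an illustration of the `Map` variant mentioned above, the following is a minimal sketch that only assembles the flat configuration keys with the JDK. The key names are derived from the `gremlin.graph` property and the `gremlin.arangodb.conf.graph` / `gremlin.arangodb.conf.driver` prefixes used in this document. The class name `MapConfigSketch`, the values, and the use of a single host string are illustrative assumptions, not confirmed provider behavior.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: collects flat configuration keys in a plain Java Map.
class MapConfigSketch {

    static Map<String, Object> buildConfig() {
        Map<String, Object> conf = new LinkedHashMap<>();
        // Mandatory: tells GraphFactory which Graph implementation to instantiate
        conf.put("gremlin.graph", "com.arangodb.tinkerpop.gremlin.structure.ArangoDBGraph");
        // Graph settings (placeholder values)
        conf.put("gremlin.arangodb.conf.graph.db", "testDb");
        conf.put("gremlin.arangodb.conf.graph.name", "myGraph");
        conf.put("gremlin.arangodb.conf.graph.enableDataDefinition", true);
        // Driver settings (placeholder values; a single host string is an assumption)
        conf.put("gremlin.arangodb.conf.driver.hosts", "localhost:8529");
        conf.put("gremlin.arangodb.conf.driver.user", "root");
        conf.put("gremlin.arangodb.conf.driver.password", "test");
        return conf;
    }

    public static void main(String[] args) {
        // Print the keys so the flat naming scheme is visible
        buildConfig().forEach((key, value) -> System.out.println(key + " = " + value));
        // With TinkerPop and the provider on the classpath, the map would then be
        // handed over with: Graph graph = GraphFactory.open(buildConfig());
    }
}
```

Keeping the configuration in a map can be convenient when the values come from environment variables or another settings source at runtime rather than from a file.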
+
+### Graph Configuration Properties
+
+Graph configuration properties are prefixed with `gremlin.arangodb.conf.graph`:
+
+| Property                                           | Description                           | Default     |
+|----------------------------------------------------|---------------------------------------|-------------|
+| `gremlin.arangodb.conf.graph.db`                   | ArangoDB database name                | `_system`   |
+| `gremlin.arangodb.conf.graph.name`                 | ArangoDB graph name                   | `tinkerpop` |
+| `gremlin.arangodb.conf.graph.enableDataDefinition` | Flag to allow data definition changes | `false`     |
+| `gremlin.arangodb.conf.graph.type`                 | Graph type: `SIMPLE` or `COMPLEX`     | `SIMPLE`    |
+| `gremlin.arangodb.conf.graph.orphanCollections`    | List of orphan collection names       | -           |
+| `gremlin.arangodb.conf.graph.edgeDefinitions`      | List of edge definitions              | -           |
+
+### Driver Configuration Properties
+
+Driver configuration properties are prefixed with `gremlin.arangodb.conf.driver`. All properties from
+`com.arangodb.config.ArangoConfigProperties` are supported. See
+the [ArangoDB Java Driver documentation](https://docs.arangodb.com/stable/develop/drivers/java/reference-version-7/driver-setup/#config-file-properties)
+for details.
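To make the prefixing rule concrete, here is a small sketch that moves plain driver property names under the `gremlin.arangodb.conf.driver` prefix. The helper class `DriverPrefixSketch` is hypothetical and only manipulates strings. The property names `hosts`, `user`, and `password` are the ones used in the configuration examples of this document, and the values are placeholders.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: shows how driver property names map to prefixed keys.
class DriverPrefixSketch {

    static final String DRIVER_PREFIX = "gremlin.arangodb.conf.driver";

    // Returns a copy of the given driver settings with every key moved under DRIVER_PREFIX
    static Map<String, String> prefixed(Map<String, String> driverProps) {
        Map<String, String> result = new LinkedHashMap<>();
        driverProps.forEach((key, value) -> result.put(DRIVER_PREFIX + "." + key, value));
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> driverProps = new LinkedHashMap<>();
        driverProps.put("hosts", "172.28.0.1:8529");
        driverProps.put("user", "root");
        driverProps.put("password", "test");
        // Prints one "key=value" line per property, in the style of a properties file
        prefixed(driverProps).forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```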
+
+### YAML Configuration
+
+```yaml
+gremlin:
+  graph: "com.arangodb.tinkerpop.gremlin.structure.ArangoDBGraph"
+  arangodb:
+    conf:
+      graph:
+        db: "testDb"
+        name: "myFirstGraph"
+        enableDataDefinition: true
+        type: COMPLEX
+        orphanCollections: [ "x", "y", "z" ]
+        edgeDefinitions:
+          - "e1:[a]->[b]"
+          - "e2:[a,b]->[c,d]"
+      driver:
+        user: "root"
+        password: "test"
+        hosts:
+          - "172.28.0.1:8529"
+          - "172.28.0.1:8539"
+          - "172.28.0.1:8549"
+```
+
+Loading from a YAML file:
+
+[//]: <> (@formatter:off)
+```java
+ArangoDBGraph graph = (ArangoDBGraph) GraphFactory.open("");
+```
+[//]: <> (@formatter:on)
+
+### Programmatic Configuration
+
+Using the configuration builder:
+
+[//]: <> (@formatter:off)
+```java
+Configuration conf = new ArangoDBConfigurationBuilder()
+        .hosts("172.28.0.1:8529")
+        .user("root")
+        .password("test")
+        .database("testDb")
+        .name("myGraph")
+        .graphType(GraphType.SIMPLE)
+        .enableDataDefinition(true)
+        .build();
+
+ArangoDBGraph graph = (ArangoDBGraph) GraphFactory.open(conf);
+```
+[//]: <> (@formatter:on)
+
+### SSL Configuration
+
+To use TLS-secured connections to ArangoDB, set `gremlin.arangodb.conf.driver.useSsl` to `true` and configure other
+SSL-related properties as needed (see the related
+[documentation](https://docs.arangodb.com/stable/develop/drivers/java/reference-version-7/driver-setup/#config-file-properties)):
+
+```yaml
+gremlin:
+  graph: "com.arangodb.tinkerpop.gremlin.structure.ArangoDBGraph"
+  arangodb:
+    conf:
+      driver:
+        hosts:
+          - "172.28.0.1:8529"
+        useSsl: true
+        verifyHost: false
+        sslCertValue: "MIIDezCCAmOgAwIBAgIEeDCzXzANBgkqhkiG9w0BAQsFADBuMRAwDgYDVQQGEwdVbmtub3duMRAwDgYDVQQIEwdVbmtub3duMRAwDgYDVQQHEwdVbmtub3duMRAwDgYDVQQKEwdVbmtub3duMRAwDgYDVQQLEwdVbmtub3duMRIwEAYDVQQDEwlsb2NhbGhvc3QwHhcNMjAxMTAxMTg1MTE5WhcNMzAxMDMwMTg1MTE5WjBuMRAwDgYDVQQGEwdVbmtub3duMRAwDgYDVQQIEwdVbmtub3duMRAwDgYDVQQHEwdVbmtub3duMRAwDgYDVQQKEwdVbmtub3duMRAwDgYDVQQLEwdVbmtub3duMRIwEAYDVQQDEwlsb2NhbGhvc3QwggEiMA0GCSqGSIb3DQEBAQUAA4IBDwAwggEKAoIBAQC1WiDnd4+uCmMG539ZNZB8NwI0RZF3sUSQGPx3lkqaFTZVEzMZL76HYvdc9Qg7difyKyQ09RLSpMALX9euSseD7bZGnfQH52BnKcT09eQ3wh7aVQ5sN2omygdHLC7X9usntxAfv7NzmvdogNXoJQyY/hSZff7RIqWH8NnAUKkjqOe6Bf5LDbxHKESmrFBxOCOnhcpvZWetwpiRdJVPwUn5P82CAZzfiBfmBZnB7D0l+/6Cv4jMuH26uAIcixnVekBQzl1RgwczuiZf2MGO64vDMMJJWE9ClZF1uQuQrwXF6qwhuP1Hnkii6wNbTtPWlGSkqeutr004+Hzbf8KnRY4PAgMBAAGjITAfMB0GA1UdDgQWBBTBrv9Awynt3C5IbaCNyOW5v4DNkTANBgkqhkiG9w0BAQsFAAOCAQEAIm9rPvDkYpmzpSIhR3VXG9Y71gxRDrqkEeLsMoEyqGnw/zx1bDCNeGg2PncLlW6zTIipEBooixIE9U7KxHgZxBy0Et6EEWvIUmnr6F4F+dbTD050GHlcZ7eOeqYTPYeQC502G1Fo4tdNi4lDP9L9XZpf7Q1QimRH2qaLS03ZFZa2tY7ah/RQqZL8Dkxx8/zc25sgTHVpxoK853glBVBs/ENMiyGJWmAXQayewY3EPt/9wGwV4KmU3dPDleQeXSUGPUISeQxFjy+jCw21pYviWVJTNBA9l5ny3GhEmcnOT/gQHCvVRLyGLMbaMZ4JrPwb+aAtBgrgeiK4xeSMMvrbhw=="
+```
+
+If no `sslCertValue` is provided, the default SSL context is used. In that case, you can specify the truststore
+using the system properties `javax.net.ssl.trustStore` and `javax.net.ssl.trustStorePassword`.
+
+### Data Definition Management
+
+When a graph is instantiated, the provider compares existing data definitions in ArangoDB with the structure expected by
+your configuration. It checks whether:
+
+- The database exists
+- The graph exists
+- The graph structure has the same edge definitions and orphan collections
+
+If there's a mismatch, an error is thrown and the graph is not instantiated. To automatically create missing data
+definitions, set `gremlin.arangodb.conf.graph.enableDataDefinition` to `true`. This allows:
+
+- Creating a new database if it doesn't exist
+- Creating a new graph if it doesn't exist (along with vertex and edge collections)
+
+Existing graphs are never modified automatically.
+
+Collection names (vertex and edge collections) will be prefixed with the graph name if they aren't already.
+
+## Graph Types
+
+The ArangoDB TinkerPop Provider supports two graph types, which can be configured with the property
+`gremlin.arangodb.conf.graph.type`: `SIMPLE` and `COMPLEX`.
+
+### SIMPLE Graph Type
+
+From an application perspective, this is the most flexible graph type. It is backed by an ArangoDB graph composed of
+only one vertex collection and one edge definition.
+
+It has the following advantages:
+
+- It closely matches the TinkerPop property graph
+- It is simpler to get started and run examples
+- It imposes no restrictions on element IDs
+- It supports arbitrary labels, i.e., labels not known at graph construction time
+
+It has the following disadvantages:
+
+- All vertex types are stored in the same vertex collection
+- All edge types are stored in the same edge collection
+- It may not leverage the full potential of ArangoDB graph traversals
+- It may require an index on the `_label` field to improve performance
+
+Example configuration:
+
+```yaml
+gremlin:
+  graph: "com.arangodb.tinkerpop.gremlin.structure.ArangoDBGraph"
+  arangodb:
+    conf:
+      graph:
+        db: "db"
+        name: "myGraph"
+        type: SIMPLE
+        edgeDefinitions:
+          - "e:[v]->[v]"
+```
+
+If `edgeDefinitions` are not configured, the default names will be used:
+
+- `_vertex` will be used for the vertex collection
+- `_edge` will be used for the edge collection
+
+Using a `SIMPLE` graph configured as in the example above and creating a new element like:
+
+[//]: <> (@formatter:off)
+```java
+graph.addVertex("person", T.id, "foo");
+```
+[//]: <> (@formatter:on)
+
+would result in creating a document in the vertex collection `myGraph_v` with `_id` equal to `myGraph_v/foo`.
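The naming in the `SIMPLE` example can be traced with a few lines of plain Java. This is only a sketch of the rule as described in this document (collection names get the graph name as prefix unless they already have it, and the `_id` is the collection name followed by `/` and the key). The class `SimpleGraphNaming`, its methods, and the `_` separator check are illustrative assumptions, not the provider's actual implementation.

```java
// Hypothetical illustration of the naming rule for SIMPLE graphs.
class SimpleGraphNaming {

    // Prefixes the collection name with "<graphName>_" unless it already starts with it
    static String collectionName(String graphName, String name) {
        String prefix = graphName + "_";
        return name.startsWith(prefix) ? name : prefix + name;
    }

    // ArangoDB document identifiers have the form "<collection>/<key>"
    static String documentId(String graphName, String collection, String key) {
        return collectionName(graphName, collection) + "/" + key;
    }

    public static void main(String[] args) {
        // Graph "myGraph", vertex collection "v" (from the edge definition "e:[v]->[v]"), key "foo"
        System.out.println(documentId("myGraph", "v", "foo")); // prints "myGraph_v/foo"
    }
}
```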
+
+### COMPLEX Graph Type
+
+The `COMPLEX` graph type is backed by an ArangoDB graph potentially composed of multiple vertex collections and multiple
+edge definitions. It has the following advantages:
+
+- It closely matches the ArangoDB graph structure
+- It allows multiple vertex collections and multiple edge collections
+- It partitions the data in a finer way
+- It allows indexing and sharding collections independently
+- It can match pre-existing database graph structures
+
+On the other hand, it has the following constraints:
+
+- Element IDs must have the format: `_