diff --git a/site/content/3.12/develop/foxx-microservices/guides/authentication-and-sessions.md b/site/content/3.12/develop/foxx-microservices/guides/authentication-and-sessions.md
index 2307728d20..9b77d01f3e 100644
--- a/site/content/3.12/develop/foxx-microservices/guides/authentication-and-sessions.md
+++ b/site/content/3.12/develop/foxx-microservices/guides/authentication-and-sessions.md
@@ -23,7 +23,7 @@ authentication.
In this example we'll use two collections: a `users` collection to store the
user objects with names and credentials, and a `sessions` collection to store
the session data. We'll also make sure usernames are unique
-by adding a hash index:
+by adding a `persistent` index:
```js
"use strict";
@@ -37,7 +37,7 @@ if (!db._collection(sessions)) {
db._createDocumentCollection(sessions);
}
module.context.collection("users").ensureIndex({
- type: "hash",
+ type: "persistent",
unique: true,
fields: ["username"]
});
diff --git a/site/content/3.12/develop/http-api/indexes/_index.md b/site/content/3.12/develop/http-api/indexes/_index.md
index f759f7e174..a35384f3d0 100644
--- a/site/content/3.12/develop/http-api/indexes/_index.md
+++ b/site/content/3.12/develop/http-api/indexes/_index.md
@@ -221,8 +221,8 @@ paths:
insert a value into the index that already exists in the index always fails,
regardless of the value of this attribute.
- The optional **estimates** attribute is supported by persistent indexes.
- This attribute controls whether index selectivity estimates are
+ The optional **estimates** attribute is supported by `persistent`, `mdi`, and
+ `mdi-prefixed` indexes. This attribute controls whether index selectivity estimates are
maintained for the index. Not maintaining index selectivity estimates can have
a slightly positive impact on write performance.
The downside of turning off index selectivity estimates will be that
diff --git a/site/content/3.12/develop/http-api/indexes/multi-dimensional.md b/site/content/3.12/develop/http-api/indexes/multi-dimensional.md
index e38f031495..5627af291b 100644
--- a/site/content/3.12/develop/http-api/indexes/multi-dimensional.md
+++ b/site/content/3.12/develop/http-api/indexes/multi-dimensional.md
@@ -111,7 +111,10 @@ paths:
default: false
sparse:
description: |
- If `true`, then create a sparse index.
+ Whether to create a sparse index that excludes documents with
+ at least one of the attributes for indexing missing or set to
+ `null`. These attributes are defined by `fields` and (for
+ `mdi-prefixed` indexes) by `prefixFields`.
type: boolean
default: false
estimates:
diff --git a/site/content/3.12/develop/http-api/indexes/persistent.md b/site/content/3.12/develop/http-api/indexes/persistent.md
index 7017836a3b..cb916d7aed 100644
--- a/site/content/3.12/develop/http-api/indexes/persistent.md
+++ b/site/content/3.12/develop/http-api/indexes/persistent.md
@@ -113,8 +113,9 @@ paths:
default: false
sparse:
description: |
- Whether create a sparse index that excludes documents with at least
- one of the `fields` missing or set to `null`.
+ Whether to create a sparse index that excludes documents with
+ at least one of the attributes for indexing missing or set to
+ `null`. These attributes are defined by `fields`.
type: boolean
default: false
deduplicate:
diff --git a/site/content/3.12/develop/http-api/indexes/vector.md b/site/content/3.12/develop/http-api/indexes/vector.md
index 98b973d3d6..c0ad039871 100644
--- a/site/content/3.12/develop/http-api/indexes/vector.md
+++ b/site/content/3.12/develop/http-api/indexes/vector.md
@@ -57,7 +57,7 @@ paths:
A list with exactly one attribute path to specify
where the vector embedding is stored in each document. The vector data needs
to be populated before creating the index.
-
+
If you want to index another vector embedding attribute, you need to create a
separate vector index.
type: array
@@ -65,6 +65,13 @@ paths:
maxItems: 1
items:
type: string
+ sparse:
+ description: |
+ Whether to create a sparse index that excludes documents with
+ the attribute for indexing missing or set to `null`. This
+ attribute is defined by `fields`.
+ type: boolean
+ default: false
parallelism:
description: |
The number of threads to use for indexing.
diff --git a/site/content/3.12/index-and-search/indexing/index-utilization.md b/site/content/3.12/index-and-search/indexing/index-utilization.md
index c6d239b7ef..46a4592bb0 100644
--- a/site/content/3.12/index-and-search/indexing/index-utilization.md
+++ b/site/content/3.12/index-and-search/indexing/index-utilization.md
@@ -19,8 +19,9 @@ It is often beneficial to create an index on more than just one attribute. By ad
to an index, an index can become more selective and thus reduce the number of documents that
queries need to process.
-ArangoDB's primary indexes, edges indexes and hash indexes will automatically provide selectivity
-estimates. Index selectivity estimates are provided in the web interface, the `indexes()` return
+ArangoDB's `primary` and `edge` indexes automatically provide selectivity estimates.
+The `persistent`, `mdi`, and `mdi-prefixed` indexes also provide them by default.
+Index selectivity estimates are provided in the web interface, the `indexes()` return
value and in the `explain()` output for a given query.
The more selective an index is, the more documents it will filter on average. The index selectivity
diff --git a/site/content/3.12/index-and-search/indexing/which-index-to-use-when.md b/site/content/3.12/index-and-search/indexing/which-index-to-use-when.md
index 97f3d8206f..e045597a39 100644
--- a/site/content/3.12/index-and-search/indexing/which-index-to-use-when.md
+++ b/site/content/3.12/index-and-search/indexing/which-index-to-use-when.md
@@ -175,11 +175,11 @@ db.collection.ensureIndex({ type: "persistent", fields: [ "attributeName1", "att
When not explicitly set, the `sparse` attribute defaults to `false` for new indexes.
Indexes other than persistent do not support the `sparse` option.
-As sparse indexes may exclude some documents from the collection, they cannot be used for
-all types of queries. Sparse hash indexes cannot be used to find documents for which at
-least one of the indexed attributes has a value of `null`. For example, the following AQL
-query cannot use a sparse index, even if one was created on attribute `attr`:
-
+As sparse indexes may exclude some documents from the collection, they cannot
+be used for all types of queries. In particular, sparse persistent indexes cannot
+be used to find documents for which at least one of the indexed attributes
+is missing or has a value of `null`. For example, the following AQL
+query cannot use a sparse index over the attribute `attr`:
```aql
FOR doc In collection
@@ -189,15 +189,25 @@ FOR doc In collection
If the lookup value is non-constant, a sparse index may or may not be used, depending on
the other types of conditions in the query. If the optimizer can safely determine that
-the lookup value cannot be `null`, a sparse index may be used. When uncertain, the optimizer
-does not make use of a sparse index in a query in order to produce correct results.
+the lookup value cannot be `null`, a sparse index may be used.
+
+```aql
+FOR doc In collection
+ LET random = RAND() * 5
+ FILTER doc.attr < random // Includes numbers < random but also true, false, and null!
+ FILTER doc.attr != null // Explicitly exclude null to make a sparse index eligible
+ RETURN doc
+```
+
+When uncertain, the optimizer does not make use of a sparse index in a query in
+order to produce correct results.
For example, the following queries cannot use a sparse index on `attr` because the optimizer
does not know beforehand whether the values which are compared to `doc.attr` include `null`:
```aql
FOR doc In collection
- FILTER doc.attr == SOME_FUNCTION(...)
+ FILTER doc.attr == SOME_FUNCTION(...)
RETURN doc
```
diff --git a/site/content/3.12/index-and-search/indexing/working-with-indexes/vector-indexes.md b/site/content/3.12/index-and-search/indexing/working-with-indexes/vector-indexes.md
index 17d9be8fe3..e73f6c2971 100644
--- a/site/content/3.12/index-and-search/indexing/working-with-indexes/vector-indexes.md
+++ b/site/content/3.12/index-and-search/indexing/working-with-indexes/vector-indexes.md
@@ -65,15 +65,18 @@ centroids and the quality of vector search thus degrades.
- **fields** (array of strings): A list with a single attribute path to specify
where the vector embedding is stored in each document. The vector data needs
to be populated before creating the index.
-
+
If you want to index another vector embedding attribute, you need to create a
separate vector index.
+- **sparse** (boolean): Whether to create a sparse index that excludes documents
+ in which the indexed attribute (defined by `fields`) is missing or set to
+ `null`. Default: `false`.
- **parallelism** (number):
- The number of threads to use for indexing. The default is `2`.
+ The number of threads to use for indexing. Default: `2`.
- **inBackground** (boolean):
Set this option to `true` to keep the collection/shards available for
write operations by not using an exclusive write lock for the duration
- of the index creation. The default is `false`.
+ of the index creation. Default: `false`.
- **params**: The parameters as used by the Faiss library.
- **metric** (string): Whether to use `cosine` or `l2` (Euclidean) distance calculation.
- **dimension** (number): The vector dimension. The attribute to index needs to
@@ -89,11 +92,11 @@ centroids and the quality of vector search thus degrades.
number of documents.
- **defaultNProbe** (number, _optional_): How many neighboring centroids to
consider for the search results by default. The larger the number, the slower
- the search but the better the search results. The default is `1`. You should
+ the search but the better the search results. Default: `1`. You should
generally use a higher value here or per query via the `nProbe` option of
the vector similarity functions.
- **trainingIterations** (number, _optional_): The number of iterations in the
- training process. The default is `25`. Smaller values lead to a faster index
+ training process. Default: `25`. Smaller values lead to a faster index
creation but may yield worse search results.
- **factory** (string, _optional_): You can specify an index factory string that is
forwarded to the underlying Faiss library, allowing you to combine different
diff --git a/site/content/3.12/release-notes/version-3.12/whats-new-in-3-12.md b/site/content/3.12/release-notes/version-3.12/whats-new-in-3-12.md
index 3a195f857d..b26505daad 100644
--- a/site/content/3.12/release-notes/version-3.12/whats-new-in-3-12.md
+++ b/site/content/3.12/release-notes/version-3.12/whats-new-in-3-12.md
@@ -1443,6 +1443,13 @@ utilizing vector indexes in queries.
Furthermore, a new error code `ERROR_QUERY_VECTOR_SEARCH_NOT_APPLIED` (1554)
has been added.
+---
+
+Introduced in: v3.12.6
+
+Vector indexes can now be sparse, which excludes documents whose embedding
+attribute is missing or set to `null` from the index.
+
## Server options
### Effective and available startup options
diff --git a/site/content/3.13/develop/foxx-microservices/guides/authentication-and-sessions.md b/site/content/3.13/develop/foxx-microservices/guides/authentication-and-sessions.md
index 2307728d20..9b77d01f3e 100644
--- a/site/content/3.13/develop/foxx-microservices/guides/authentication-and-sessions.md
+++ b/site/content/3.13/develop/foxx-microservices/guides/authentication-and-sessions.md
@@ -23,7 +23,7 @@ authentication.
In this example we'll use two collections: a `users` collection to store the
user objects with names and credentials, and a `sessions` collection to store
the session data. We'll also make sure usernames are unique
-by adding a hash index:
+by adding a `persistent` index:
```js
"use strict";
@@ -37,7 +37,7 @@ if (!db._collection(sessions)) {
db._createDocumentCollection(sessions);
}
module.context.collection("users").ensureIndex({
- type: "hash",
+ type: "persistent",
unique: true,
fields: ["username"]
});
diff --git a/site/content/3.13/develop/http-api/indexes/_index.md b/site/content/3.13/develop/http-api/indexes/_index.md
index f759f7e174..a35384f3d0 100644
--- a/site/content/3.13/develop/http-api/indexes/_index.md
+++ b/site/content/3.13/develop/http-api/indexes/_index.md
@@ -221,8 +221,8 @@ paths:
insert a value into the index that already exists in the index always fails,
regardless of the value of this attribute.
- The optional **estimates** attribute is supported by persistent indexes.
- This attribute controls whether index selectivity estimates are
+ The optional **estimates** attribute is supported by `persistent`, `mdi`, and
+ `mdi-prefixed` indexes. This attribute controls whether index selectivity estimates are
maintained for the index. Not maintaining index selectivity estimates can have
a slightly positive impact on write performance.
The downside of turning off index selectivity estimates will be that
diff --git a/site/content/3.13/develop/http-api/indexes/multi-dimensional.md b/site/content/3.13/develop/http-api/indexes/multi-dimensional.md
index e38f031495..5627af291b 100644
--- a/site/content/3.13/develop/http-api/indexes/multi-dimensional.md
+++ b/site/content/3.13/develop/http-api/indexes/multi-dimensional.md
@@ -111,7 +111,10 @@ paths:
default: false
sparse:
description: |
- If `true`, then create a sparse index.
+ Whether to create a sparse index that excludes documents with
+ at least one of the attributes for indexing missing or set to
+ `null`. These attributes are defined by `fields` and (for
+ `mdi-prefixed` indexes) by `prefixFields`.
type: boolean
default: false
estimates:
diff --git a/site/content/3.13/develop/http-api/indexes/persistent.md b/site/content/3.13/develop/http-api/indexes/persistent.md
index 7017836a3b..cb916d7aed 100644
--- a/site/content/3.13/develop/http-api/indexes/persistent.md
+++ b/site/content/3.13/develop/http-api/indexes/persistent.md
@@ -113,8 +113,9 @@ paths:
default: false
sparse:
description: |
- Whether create a sparse index that excludes documents with at least
- one of the `fields` missing or set to `null`.
+ Whether to create a sparse index that excludes documents with
+ at least one of the attributes for indexing missing or set to
+ `null`. These attributes are defined by `fields`.
type: boolean
default: false
deduplicate:
diff --git a/site/content/3.13/develop/http-api/indexes/vector.md b/site/content/3.13/develop/http-api/indexes/vector.md
index 98b973d3d6..c0ad039871 100644
--- a/site/content/3.13/develop/http-api/indexes/vector.md
+++ b/site/content/3.13/develop/http-api/indexes/vector.md
@@ -57,7 +57,7 @@ paths:
A list with exactly one attribute path to specify
where the vector embedding is stored in each document. The vector data needs
to be populated before creating the index.
-
+
If you want to index another vector embedding attribute, you need to create a
separate vector index.
type: array
@@ -65,6 +65,13 @@ paths:
maxItems: 1
items:
type: string
+ sparse:
+ description: |
+ Whether to create a sparse index that excludes documents with
+ the attribute for indexing missing or set to `null`. This
+ attribute is defined by `fields`.
+ type: boolean
+ default: false
parallelism:
description: |
The number of threads to use for indexing.
diff --git a/site/content/3.13/index-and-search/indexing/index-utilization.md b/site/content/3.13/index-and-search/indexing/index-utilization.md
index c6d239b7ef..46a4592bb0 100644
--- a/site/content/3.13/index-and-search/indexing/index-utilization.md
+++ b/site/content/3.13/index-and-search/indexing/index-utilization.md
@@ -19,8 +19,9 @@ It is often beneficial to create an index on more than just one attribute. By ad
to an index, an index can become more selective and thus reduce the number of documents that
queries need to process.
-ArangoDB's primary indexes, edges indexes and hash indexes will automatically provide selectivity
-estimates. Index selectivity estimates are provided in the web interface, the `indexes()` return
+ArangoDB's `primary` and `edge` indexes automatically provide selectivity estimates.
+The `persistent`, `mdi`, and `mdi-prefixed` indexes also provide them by default.
+Index selectivity estimates are provided in the web interface, the `indexes()` return
value and in the `explain()` output for a given query.
The more selective an index is, the more documents it will filter on average. The index selectivity
diff --git a/site/content/3.13/index-and-search/indexing/which-index-to-use-when.md b/site/content/3.13/index-and-search/indexing/which-index-to-use-when.md
index 97f3d8206f..e045597a39 100644
--- a/site/content/3.13/index-and-search/indexing/which-index-to-use-when.md
+++ b/site/content/3.13/index-and-search/indexing/which-index-to-use-when.md
@@ -175,11 +175,11 @@ db.collection.ensureIndex({ type: "persistent", fields: [ "attributeName1", "att
When not explicitly set, the `sparse` attribute defaults to `false` for new indexes.
Indexes other than persistent do not support the `sparse` option.
-As sparse indexes may exclude some documents from the collection, they cannot be used for
-all types of queries. Sparse hash indexes cannot be used to find documents for which at
-least one of the indexed attributes has a value of `null`. For example, the following AQL
-query cannot use a sparse index, even if one was created on attribute `attr`:
-
+As sparse indexes may exclude some documents from the collection, they cannot
+be used for all types of queries. In particular, sparse persistent indexes cannot
+be used to find documents for which at least one of the indexed attributes
+is missing or has a value of `null`. For example, the following AQL
+query cannot use a sparse index over the attribute `attr`:
```aql
FOR doc In collection
@@ -189,15 +189,25 @@ FOR doc In collection
If the lookup value is non-constant, a sparse index may or may not be used, depending on
the other types of conditions in the query. If the optimizer can safely determine that
-the lookup value cannot be `null`, a sparse index may be used. When uncertain, the optimizer
-does not make use of a sparse index in a query in order to produce correct results.
+the lookup value cannot be `null`, a sparse index may be used.
+
+```aql
+FOR doc In collection
+ LET random = RAND() * 5
+ FILTER doc.attr < random // Includes numbers < random but also true, false, and null!
+ FILTER doc.attr != null // Explicitly exclude null to make a sparse index eligible
+ RETURN doc
+```
+
+When uncertain, the optimizer does not make use of a sparse index in a query in
+order to produce correct results.
For example, the following queries cannot use a sparse index on `attr` because the optimizer
does not know beforehand whether the values which are compared to `doc.attr` include `null`:
```aql
FOR doc In collection
- FILTER doc.attr == SOME_FUNCTION(...)
+ FILTER doc.attr == SOME_FUNCTION(...)
RETURN doc
```
diff --git a/site/content/3.13/index-and-search/indexing/working-with-indexes/vector-indexes.md b/site/content/3.13/index-and-search/indexing/working-with-indexes/vector-indexes.md
index 236093878b..77f97bcd54 100644
--- a/site/content/3.13/index-and-search/indexing/working-with-indexes/vector-indexes.md
+++ b/site/content/3.13/index-and-search/indexing/working-with-indexes/vector-indexes.md
@@ -65,15 +65,18 @@ centroids and the quality of vector search thus degrades.
- **fields** (array of strings): A list with a single attribute path to specify
where the vector embedding is stored in each document. The vector data needs
to be populated before creating the index.
-
+
If you want to index another vector embedding attribute, you need to create a
separate vector index.
+- **sparse** (boolean): Whether to create a sparse index that excludes documents
+ in which the indexed attribute (defined by `fields`) is missing or set to
+ `null`. Default: `false`.
- **parallelism** (number):
- The number of threads to use for indexing. The default is `2`.
+ The number of threads to use for indexing. Default: `2`.
- **inBackground** (boolean):
Set this option to `true` to keep the collection/shards available for
write operations by not using an exclusive write lock for the duration
- of the index creation. The default is `false`.
+ of the index creation. Default: `false`.
- **params**: The parameters as used by the Faiss library.
- **metric** (string): Whether to use `cosine` or `l2` (Euclidean) distance calculation.
- **dimension** (number): The vector dimension. The attribute to index needs to
@@ -89,11 +92,11 @@ centroids and the quality of vector search thus degrades.
number of documents.
- **defaultNProbe** (number, _optional_): How many neighboring centroids to
consider for the search results by default. The larger the number, the slower
- the search but the better the search results. The default is `1`. You should
+ the search but the better the search results. Default: `1`. You should
generally use a higher value here or per query via the `nProbe` option of
the vector similarity functions.
- **trainingIterations** (number, _optional_): The number of iterations in the
- training process. The default is `25`. Smaller values lead to a faster index
+ training process. Default: `25`. Smaller values lead to a faster index
creation but may yield worse search results.
- **factory** (string, _optional_): You can specify an index factory string that is
forwarded to the underlying Faiss library, allowing you to combine different
diff --git a/site/content/3.13/release-notes/version-3.12/whats-new-in-3-12.md b/site/content/3.13/release-notes/version-3.12/whats-new-in-3-12.md
index 3a195f857d..b26505daad 100644
--- a/site/content/3.13/release-notes/version-3.12/whats-new-in-3-12.md
+++ b/site/content/3.13/release-notes/version-3.12/whats-new-in-3-12.md
@@ -1443,6 +1443,13 @@ utilizing vector indexes in queries.
Furthermore, a new error code `ERROR_QUERY_VECTOR_SEARCH_NOT_APPLIED` (1554)
has been added.
+---
+
+Introduced in: v3.12.6
+
+Vector indexes can now be sparse, which excludes documents whose embedding
+attribute is missing or set to `null` from the index.
+
## Server options
### Effective and available startup options