From 4bd10cd4de52a3e66e4cef85c39645fb9879d739 Mon Sep 17 00:00:00 2001 From: matea16 Date: Fri, 11 Jul 2025 09:24:59 +0200 Subject: [PATCH 01/13] init --- pages/release-notes.mdx | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/pages/release-notes.mdx b/pages/release-notes.mdx index 2e9da7507..726503bad 100644 --- a/pages/release-notes.mdx +++ b/pages/release-notes.mdx @@ -66,6 +66,14 @@ updated. ## πŸš€ Latest release +### Memgraph v3.5.0 - August 27th, 2025 + +### MAGE v3.5.0 - August 27th, 2025 + +### Lab v3.5.0 - August 27th, 2025 + +## Previous releases + ### Memgraph v3.4.0 - July 10th, 2025 {

⚠️ Breaking changes

} @@ -148,8 +156,6 @@ updated. one format to another temporal string format. [#631](https://github.com/memgraph/mage/pull/631) -## Previous releases - ### Memgraph v3.3.0 - June 4th, 2025 {

✨ New features

} From d93cad0deacd645d0ab2bc962261b0031fa66bf2 Mon Sep 17 00:00:00 2001 From: Marko Budiselic Date: Sun, 17 Aug 2025 14:16:18 -0700 Subject: [PATCH 02/13] Add the first release notes item --- pages/release-notes.mdx | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/pages/release-notes.mdx b/pages/release-notes.mdx index e145daa64..780555b0d 100644 --- a/pages/release-notes.mdx +++ b/pages/release-notes.mdx @@ -68,6 +68,12 @@ updated. ### Memgraph v3.5.0 - August 27th, 2025 +{

✨ New features

} + +- Added the ability to create a text index on a subset of properties using the + following syntax: `CREATE TEXT INDEX index_name ON :Label(prop1, prop2, + prop3);` [#3155](https://github.com/memgraph/memgraph/pull/3155) + ### MAGE v3.5.0 - August 27th, 2025 ### Lab v3.5.0 - August 27th, 2025 From cee09f6aa4809e4a9edd6ae0f6cf58bddafded97 Mon Sep 17 00:00:00 2001 From: Andi Skrgat Date: Thu, 21 Aug 2025 08:56:26 +0200 Subject: [PATCH 03/13] Add docs for STRICT_SYNC replication mode (#1351) * Add docs to 2PC * Update types of cluster --------- Co-authored-by: Matea Pesic <80577904+matea16@users.noreply.github.com> --- pages/clustering/high-availability.mdx | 187 +++++++++++++------------ pages/clustering/replication.mdx | 48 +++++-- 2 files changed, 135 insertions(+), 100 deletions(-) diff --git a/pages/clustering/high-availability.mdx b/pages/clustering/high-availability.mdx index 3629f2c7a..7cfb9da23 100644 --- a/pages/clustering/high-availability.mdx +++ b/pages/clustering/high-availability.mdx @@ -12,12 +12,12 @@ import {CommunityLinks} from '/components/social-card/CommunityLinks' A cluster is considered highly available if, at any point, there is some instance that can respond to a user query. Our high availability relies on replication. The cluster consists of: -- The MAIN instance on which the user can execute write queries -- REPLICA instances that can only respond to read queries +- The main instance on which the user can execute write queries +- replica instances that can only respond to read queries - COORDINATOR instances that manage the cluster state. Depending on how configuration flags are set, Memgraph can run as a data instance or coordinator instance. -The coordinator instance is a new addition to enable the high availability feature and orchestrates data instances to ensure that there is always one MAIN instance in the cluster. +The coordinator instance is a new addition to enable the high availability feature and orchestrates data instances to ensure that there is always one main instance in the cluster. ## Cluster management @@ -25,10 +25,10 @@ For achieving high availability, Memgraph uses Raft consensus protocol, which is a significant advantage that it is much easier to understand. It's important to say that Raft isn't a Byzantine fault-tolerant algorithm. You can learn more about Raft in the paper [In Search of an Understandable Consensus Algorithm](https://raft.github.io/raft.pdf). -Typical Memgraph's highly available cluster consists of 3 data instances (1 MAIN and 2 REPLICAS) and 3 coordinator instances backed up by Raft protocol. +Typical Memgraph's highly available cluster consists of 3 data instances (1 main and 2 replicaS) and 3 coordinator instances backed up by Raft protocol. Users can create more than 3 coordinators, but the replication factor (RF) of 3 is a de facto standard in distributed databases. -One coordinator instance is the leader whose job is to always ensure one writeable data instance (MAIN). The other two coordinator instances replicate +One coordinator instance is the leader whose job is to always ensure one writeable data instance (main). The other two coordinator instances replicate changes the leader coordinator did in its own Raft log. Operations saved into the Raft log are those that are related to cluster management. Memgraph doesn't have its implementation of the Raft protocol. For this task, Memgraph uses an industry-proven library [NuRaft](https://github.com/eBay/NuRaft). 
@@ -37,7 +37,7 @@ You can start the coordinator instance by specifying `--coordinator-id`, queries related to high availability, so you cannot execute any data-oriented query on it. The coordinator port is used for the Raft protocol, which all coordinators use to ensure the consistency of the cluster's state. Data instances are distinguished from coordinator instances by specifying only `--management-port` flag. This port is used for RPC network communication between the coordinator and data -instances. When started by default, the data instance is MAIN. The coordinator will ensure that no data inconsistency can happen during and after the instance's +instances. When started by default, the data instance is main. The coordinator will ensure that no data inconsistency can happen during and after the instance's restart. Once all instances are started, the user can start adding data instances to the cluster. @@ -70,19 +70,19 @@ but from the availability perspective, it is better to separate them physically. ## Bolt+routing -Directly connecting to the MAIN instance isn't preferred in the HA cluster since the MAIN instance changes due to various failures. Because of that, users +Directly connecting to the main instance isn't preferred in the HA cluster since the main instance changes due to various failures. Because of that, users can use bolt+routing so that write queries can always be sent to the correct data instance. This will prevent a split-brain issue since clients, when writing, won't be routed to the old main but rather to the new main instance on which failover got performed. This protocol works in a way that the client first sends a ROUTE bolt message to any coordinator instance. The coordinator replies to the message by returning the routing table with three entries specifying -from which instance can the data be read, to which instance data can be written to and which instances behave as routers. In the Memgraph HA cluster, the MAIN -data instance is the only writeable instance, REPLICAs are readable instances, and COORDINATORs behave as routers. However, the cluster can be configured in such a way -that MAIN can also be used for reading. Check this [paragraph](#setting-config-for-highly-available-cluster) for more info. +from which instance can the data be read, to which instance data can be written to and which instances behave as routers. In the Memgraph HA cluster, the main +data instance is the only writeable instance, replicas are readable instances, and COORDINATORs behave as routers. However, the cluster can be configured in such a way +that main can also be used for reading. Check this [paragraph](#setting-config-for-highly-available-cluster) for more info. Bolt+routing is the client-side routing protocol, meaning network endpoint resolution happens inside drivers. For more details about the Bolt messages involved in the communication, check [the following link](https://neo4j.com/docs/bolt/current/bolt/message/#messages-route). Users only need to change the scheme they use for connecting to coordinators. This means instead of using `bolt://,` you should -use `neo4j://` to get an active connection to the current MAIN instance in the cluster. You can find examples of how to +use `neo4j://` to get an active connection to the current main instance in the cluster. You can find examples of how to use bolt+routing in different programming languages [here](https://github.com/memgraph/memgraph/tree/master/tests/drivers). 
It is important to note that setting up the cluster on one coordinator (registration of data instances and coordinators, setting main) must be done using bolt connection @@ -217,17 +217,18 @@ Registering instances should be done on a single coordinator. The chosen coordin Register instance query will result in several actions: 1. The coordinator instance will connect to the data instance on the `management_server` network address. 2. The coordinator instance will start pinging the data instance every `--instance-health-check-frequency-sec` seconds to check its status. -3. Data instance will be demoted from MAIN to REPLICA. +3. Data instance will be demoted from main to replica. 4. Data instance will start the replication server on `replication_server`. ```plaintext -REGISTER INSTANCE instanceName ( AS ASYNC ) WITH CONFIG {"bolt_server": boltServer, "management_server": managementServer, "replication_server": replicationServer}; +REGISTER INSTANCE instanceName ( AS ASYNC | AS STRICT_SYNC ) ? WITH CONFIG {"bolt_server": boltServer, "management_server": managementServer, "replication_server": replicationServer}; ``` This operation will result in writing to the Raft log. -In case the MAIN instance already exists in the cluster, a replica instance will be automatically connected to the MAIN. You can specify whether the replica should behave -synchronously or asynchronously by using `AS ASYNC` construct after `instanceName`. +In case the main instance already exists in the cluster, a replica instance will be automatically connected to the main. Constructs ( AS ASYNC | AS STRICT_SYNC ) serve to specify +instance's replication mode when the instance behaves as replica. You can only have `STRICT_SYNC` and `ASYNC` or `SYNC` and `ASYNC` replicas together in the cluster. Combining `STRICT_SYNC` +and `SYNC` replicas together doesn't have proper semantic meaning so it is forbidden. ### Add coordinator instance @@ -262,23 +263,23 @@ REMOVE COORDINATOR ; ``` -### Set instance to MAIN +### Set instance to main -Once all data instances are registered, one data instance should be promoted to MAIN. This can be achieved by using the following query: +Once all data instances are registered, one data instance should be promoted to main. This can be achieved by using the following query: ```plaintext -SET INSTANCE instanceName to MAIN; +SET INSTANCE instanceName to main; ``` -This query will register all other instances as REPLICAs to the new MAIN. If one of the instances is unavailable, setting the instance to MAIN will not succeed. -If there is already a MAIN instance in the cluster, this query will fail. +This query will register all other instances as replicas to the new main. If one of the instances is unavailable, setting the instance to main will not succeed. +If there is already a main instance in the cluster, this query will fail. This operation will result in writing to the Raft log. ### Demote instance Demote instance query can be used by an admin to demote the current main to replica. In this case, the leader coordinator won't perform a failover, but as a user, -you should choose promote one of the data instances to MAIN using the `SET INSTANCE `instance` TO MAIN` query. +you should choose promote one of the data instances to main using the `SET INSTANCE `instance` TO main` query. ```plaintext DEMOTE INSTANCE instanceName; @@ -288,14 +289,13 @@ This operation will result in writing to the Raft log. 
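As an illustration of the registration and promotion queries described above, a minimal bootstrap on the chosen coordinator could look like the following sketch. The instance names and the Bolt, management and replication addresses are placeholders; `AS STRICT_SYNC` can be omitted (the default is `SYNC`) or replaced with `AS ASYNC`, keeping in mind that `STRICT_SYNC` and `SYNC` replicas cannot be mixed in the same cluster.

```plaintext
REGISTER INSTANCE instance_1 AS STRICT_SYNC WITH CONFIG {"bolt_server": "localhost:7687", "management_server": "localhost:13011", "replication_server": "localhost:10001"};
REGISTER INSTANCE instance_2 AS STRICT_SYNC WITH CONFIG {"bolt_server": "localhost:7688", "management_server": "localhost:13012", "replication_server": "localhost:10002"};
REGISTER INSTANCE instance_3 AS STRICT_SYNC WITH CONFIG {"bolt_server": "localhost:7689", "management_server": "localhost:13013", "replication_server": "localhost:10003"};
SET INSTANCE instance_3 TO MAIN;
```

Since the replication mode only applies while an instance acts as a replica, registering the future main with a mode as well simply determines how it will replicate after it is demoted, for example following a failover.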
-By combining the functionalities of queries `DEMOTE INSTANCE instanceName` and `SET INSTANCE instanceName TO MAIN` you get the manual failover capability. This can be useful -e.g during a maintenance work on the instance where the current MAIN is deployed. +By combining the functionalities of queries `DEMOTE INSTANCE instanceName` and `SET INSTANCE instanceName TO main` you get the manual failover capability. This can be useful +e.g during a maintenance work on the instance where the current main is deployed. - ### Unregister instance There are various reasons which could lead to the decision that an instance needs to be removed from the cluster. The hardware can be broken, @@ -306,21 +306,21 @@ UNREGISTER INSTANCE instanceName; ``` When unregistering an instance, ensure that the instance being unregistered is -**not** the MAIN instance. Unregistering MAIN can lead to an inconsistent -cluster state. Additionally, the cluster must have an **alive** MAIN instance -during the unregistration process. If no MAIN instance is available, the +**not** the main instance. Unregistering main can lead to an inconsistent +cluster state. Additionally, the cluster must have an **alive** main instance +during the unregistration process. If no main instance is available, the operation cannot be guaranteed to succeed. -The instance requested to be unregistered will also be unregistered from the current MAIN's REPLICA set. +The instance requested to be unregistered will also be unregistered from the current main's replica set. ### Force reset cluster state In case the cluster gets stuck there is an option to do the force reset of the cluster. You need to execute a command on the leader coordinator. This command will result in the following actions: -1. The coordinator instance will demote each alive instance to REPLICA. -2. From the alive instance it will choose a new MAIN instance. -3. Instances that are down will be demoted to REPLICAs once they come back up. +1. The coordinator instance will demote each alive instance to replica. +2. From the alive instance it will choose a new main instance. +3. Instances that are down will be demoted to replicas once they come back up. ```plaintext FORCE RESET CLUSTER STATE; @@ -334,7 +334,7 @@ You can check the state of the whole cluster using the `SHOW INSTANCES` query. T each server you can see the following information: 1. Network endpoints they are using for managing cluster state 2. Health state of server - 3. Role - MAIN, REPLICA, LEADER, FOLLOWER or unknown if not alive + 3. Role - main, replica, LEADER, FOLLOWER or unknown if not alive 4. The time passed since the last response time to the leader's health ping This query can be run on either the leader or followers. Since only the leader knows the exact status of the health state and last response time, @@ -439,13 +439,14 @@ for which the timeout is used is the following: - TimestampReq -> main sending to replica - SystemHeartbeatReq -> main sending to replica - ForceResetStorageReq -> main sending to replica. The timeout is set to 60s. -- SystemRecoveryReq -> main sending to replica. The timeout set to 5s. +- SystemRecoveryReq -> main sending to replica. The timeout is set to 5s. +- FinalizeCommitReq -> main sending to replica. The timeout is set to 10s. 
-For replication-related RPC messages β€” AppendDeltasRpc, CurrentWalRpc, and +For RPC messages which are sending the variable number of storage deltas β€” PrepareCommitRpc, CurrentWalRpc, and WalFilesRpc β€” it is not practical to set a strict execution timeout. The -processing time on the replica side is directly proportional to the amount of -data being transferred. To handle this, the replica sends periodic progress +processing time on the replica side is directly proportional to the number of +deltas being transferred. To handle this, the replica sends periodic progress updates to the main instance after processing every 100,000 deltas. Since processing 100,000 deltas is expected to take a relatively consistent amount of time, we can enforce a timeout based on this interval. The default timeout for @@ -453,7 +454,7 @@ these RPC messages is 30 seconds, though in practice, processing 100,000 deltas typically takes less than 3 seconds. SnapshotRpc is also a replication-related RPC message, but its execution time -is tracked differently. The replica sends an update to the main instance after +is tracked a bit differently from RPC messages shipping deltas. The replica sends an update to the main instance after completing 1,000,000 units of work. The work units are assigned as follows: - Processing nodes, edges, or indexed entities (label index, label-property index, @@ -483,93 +484,93 @@ a multiplier of `--instance-health-check-frequency-sec`. Set the multiplier coef For example, set `--instance-down-timeout-sec=5` and `--instance-health-check-frequency-sec=1` which will result in coordinator contacting each instance every second and the instance is considered dead after it doesn't respond 5 times (5 seconds / 1 second). -In case a REPLICA doesn't respond to a health check, the leader coordinator will try to contact it again every `--instance-health-check-frequency-sec`. -When the REPLICA instance rejoins the cluster (comes back up), it always rejoins as REPLICA. For MAIN instance, there are two options. -If it is down for less than `--instance-down-timeout-sec`, it will rejoin as MAIN because it is still considered alive. If it is down for more than `--instance-down-timeout-sec`, -the failover procedure is initiated. Whether MAIN will rejoin as MAIN depends on the success of the failover procedure. If the failover procedure succeeds, now old MAIN -will rejoin as REPLICA. If failover doesn't succeed, MAIN will rejoin as MAIN once it comes back up. +In case a replica doesn't respond to a health check, the leader coordinator will try to contact it again every `--instance-health-check-frequency-sec`. +When the replica instance rejoins the cluster (comes back up), it always rejoins as replica. For main instance, there are two options. +If it is down for less than `--instance-down-timeout-sec`, it will rejoin as main because it is still considered alive. If it is down for more than `--instance-down-timeout-sec`, +the failover procedure is initiated. Whether main will rejoin as main depends on the success of the failover procedure. If the failover procedure succeeds, now old main +will rejoin as replica. If failover doesn't succeed, main will rejoin as main once it comes back up. ### Failover procedure - high level description -From alive REPLICAs coordinator chooses a new potential MAIN. -This instance is only potentially new MAIN as the failover procedure can still fail due to various factors (networking issues, promote to MAIN fails, any alive REPLICA failing to -accept an RPC message, etc). 
The coordinator sends an RPC request to the potential new MAIN, which is still in REPLICA state, to promote itself to the MAIN instance with info -about other REPLICAs to which it will replicate data. Once that request succeeds, the new MAIN can start replication to the other instances and accept write queries. - +From alive replicas coordinator chooses a new potential main and writes a log to the Raft storage about the new main. On the next leader's ping to the instance, +it will send to the instance an RPC request to the new main, which is still in replica state, to promote itself to the main instance with info +about other replicas to which it will replicate data. Once that request succeeds, the new main can start replication to the other instances and accept write queries. -### Choosing new MAIN from available REPLICAs +### Choosing new main from available replicas -When failover is happening, some REPLICAs can also be down. From the list of alive REPLICAs, a new MAIN is chosen. First, the leader coordinator contacts each alive REPLICA +When failover is happening, some replicas can also be down. From the list of alive replicas, a new main is chosen. First, the leader coordinator contacts each alive replica to get info about each database's last commit timestamp. In the case of enabled multi-tenancy, from each instance coordinator will get info on all databases and their last commit -timestamp. Currently, the coordinator chooses an instance to become a new MAIN by comparing the latest commit timestamps of all databases. The instance which is newest on most -databases is considered the best candidate for the new MAIN. If there are multiple instances which have the same number of newest databases, we sum timestamps of all databases +timestamp. Currently, the coordinator chooses an instance to become a new main by comparing the latest commit timestamps of all databases. The instance which is newest on most +databases is considered the best candidate for the new main. If there are multiple instances which have the same number of newest databases, we sum timestamps of all databases and consider instance with a larger sum as the better candidate. -### Old MAIN rejoining to the cluster +### Old main rejoining to the cluster -Once the old MAIN gets back up, the coordinator sends an RPC request to demote the old MAIN to REPLICA. The coordinator tracks at all times which instance was the last MAIN. +Once the old main gets back up, the coordinator sends an RPC request to demote the old main to replica. The coordinator tracks at all times which instance was the last main. -The leader coordinator sends two RPC requests in the given order to demote old MAIN to REPLICA: -1. Demote MAIN to REPLICA RPC request -2. A request to store the UUID of the current MAIN, which the old MAIN, now acting as a REPLICA instance, must listen to. +The leader coordinator sends two RPC requests in the given order to demote old main to replica: +1. Demote main to replica RPC request +2. A request to store the UUID of the current main, which the old main, now acting as a replica instance, must listen to. -### How REPLICA knows which MAIN to listen +### How replica knows which main to listen -Each REPLICA has a UUID of MAIN it listens to. If a network partition happens where MAIN can talk to a REPLICA but the coordinator can't talk to the MAIN, from the coordinator's -point of view that MAIN is down. From REPLICA's point of view, the MAIN instance is still alive. 
The coordinator will start the failover procedure, and we can end up with multiple MAINs -where REPLICAs can listen to both MAINs. To prevent such an issue, each REPLICA gets a new UUID that no current MAIN has. The coordinator generates the new UUID, -which the new MAIN will get to use on its promotion to MAIN. +Each replica has a UUID of main it listens to. If a network partition happens where main can talk to a replica but the coordinator can't talk to the main, from the coordinator's +point of view that main is down. From replica's point of view, the main instance is still alive. The coordinator will start the failover procedure, and we can end up with multiple mains +where replicas can listen to both mains. To prevent such an issue, each replica gets a new UUID that no current main has. The coordinator generates the new UUID, +which the new main will get to use on its promotion to main. -If REPLICA was down at one point, MAIN could have changed. When REPLICA gets back up, it doesn't listen to any MAIN until the coordinator sends an RPC request to REPLICA to start -listening to MAIN with the given UUID. +If replica was down at one point, main could have changed. When replica gets back up, it doesn't listen to any main until the coordinator sends an RPC request to replica to start +listening to main with the given UUID. ### Replication concerns #### Force sync of data During a failover event, Memgraph selects the most up-to-date, alive instance to -become the new MAIN. The selection process works as follows: -1. From the list of available REPLICA instances, Memgraph chooses the one with +become the new main. The selection process works as follows: +1. From the list of available replica instances, Memgraph chooses the one with the latest commit timestamp for the default database. 2. If an instance that had more recent data was down during this selection -process, it will not be considered for promotion to MAIN. +process, it will not be considered for promotion to main. If a previously down instance had more up-to-date data but was unavailable during failover, it will go through a specific recovery process upon rejoining the cluster: -- The new MAIN will clear the returning replica’s storage. -- The returning replica will then receive all commits from the new MAIN to +- The replica will reset its storage. +- The replica will receive all commits from the new main to synchronize its state. - The replica's old durability files will be preserved in a `.old` directory in `data_directory/snapshots` and `data_directory/wal` folders, allowing admins to manually recover data if needed. -Memgraph prioritizes availability over strict consistency (leaning towards AP in -the CAP theorem). While it aims to maintain consistency as much as possible, the -current failover logic can result in a non-zero Recovery Point Objective (RPO), -that is, the loss of committed data, because: -- The promoted MAIN might not have received all commits from the previous MAIN +Depending on the replication mode used, there are different levels of data loss +that can happen upon the failover. With the default `SYNC` replication mode, +Memgraph prioritizes availability over strict consistency and can result in +a non-zero Recovery Point Objective (RPO), that is, the loss of committed data, because: +- The promoted main might not have received all commits from the previous main before the failure. 
-- This design ensures that the MAIN remains writable for the maximum possible +- This design ensures that the main remains writable for the maximum possible time. -If your environment requires strong consistency and can tolerate write -unavailability, [reach out to -us](https://github.com/memgraph/memgraph/discussions). We are actively exploring -support for a fully synchronous mode. +With `ASYNC` replication mode, you also risk losing some data upon the failover because +main can freely continue commiting no matter the status of ASYNC replicas. + +The `STRICT_SYNC` replication mode allows users experiencing a failover without any data loss +in all situations. It comes with reduced throughput because of the cost of running two-phase commit protocol. ## Actions on follower coordinators -From follower coordinators you can only execute `SHOW INSTANCES`. Registration of data instance, unregistration of data instances, demoting instance, setting instance to MAIN and +From follower coordinators you can only execute `SHOW INSTANCES`. Registration of data instance, unregistration of data instances, demoting instance, setting instance to main and force resetting cluster state are all disabled. ## Instances' restart ### Data instances' restart -Data instances can fail both as MAIN and as REPLICA. When an instance that was REPLICA comes back, it won't accept updates from any instance until the coordinator updates its -responsible peer. This should happen automatically when the coordinator's ping to the instance passes. When the MAIN instance comes back, any writing to the MAIN instance will be + +Data instances can fail both as main and as replica. When an instance that was replica comes back, it won't accept updates from any instance until the coordinator updates its +responsible peer. This should happen automatically when the coordinator's ping to the instance passes. When the main instance comes back, any writing to the main instance will be forbidden until a ping from the coordinator passes. ### Coordinator instances restart @@ -612,7 +613,7 @@ It will also recover the following server config information: The following information will be recovered from a common RocksDB `logs` instance: - current version of `logs` durability store - snapshots found with `snapshot_id_` prefix in database: - - coordinator cluster state - all data instances with their role (MAIN or REPLICA), all coordinator instances and UUID of MAIN instance which REPLICA is listening to + - coordinator cluster state - all data instances with their role (main or replica), all coordinator instances and UUID of main instance which replica is listening to - last log idx - last log term - last cluster config @@ -645,12 +646,16 @@ Raft is a quorum-based protocol and it needs a majority of instances alive in or the cluster stays available. With 2+ coordinator instances down (in a cluster with RF = 3), the RTO depends on the time needed for instances to come back. -Failure of REPLICA data instance isn't very harmful since users can continue writing to MAIN data instance while reading from MAIN or other -REPLICAs. The most important thing to analyze is what happens when MAIN gets down. In that case, the leader coordinator uses +Depending on the replica's replication mode, its failure can lead to different situations. If the replica was registered with STRICT_SYNC mode, then on its failure, writing +on main will be disabled. On the other hand, if replica was registered as ASYNC or SYNC, further writes on main are still allowed. 
In both cases, reads are still allowed from +main and other replicas. + + +The most important thing to analyze is what happens when main gets down. In that case, the leader coordinator uses user-controllable parameters related to the frequency of health checks from the leader to replication instances (`--instance-health-check-frequency-sec`) -and the time needed to realize the instance is down (`--instance-down-timeout-sec`). After collecting enough evidence, the leader concludes the MAIN is down and performs failover +and the time needed to realize the instance is down (`--instance-down-timeout-sec`). After collecting enough evidence, the leader concludes the main is down and performs failover using just a handful of RPC messages (correct time depends on the distance between instances). It is important to mention that the whole failover is performed without the loss of committed data -if the newly chosen MAIN (previously REPLICA) had all up-to-date data. +if the newly chosen main (previously replica) had all up-to-date data. ## Raft configuration parameters @@ -695,7 +700,7 @@ ADD COORDINATOR 3 WITH CONFIG {"bolt_server": "localhost:7693", "coordinator_ser REGISTER INSTANCE instance_1 WITH CONFIG {"bolt_server": "localhost:7687", "management_server": "instance1:13011", "replication_server": "instance1:10001"}; REGISTER INSTANCE instance_2 WITH CONFIG {"bolt_server": "localhost:7688", "management_server": "instance2:13012", "replication_server": "instance2:10002"}; REGISTER INSTANCE instance_3 WITH CONFIG {"bolt_server": "localhost:7689", "management_server": "instance3:13013", "replication_server": "instance3:10003"}; -SET INSTANCE instance_3 TO MAIN; +SET INSTANCE instance_3 TO main; ``` @@ -898,10 +903,10 @@ REGISTER INSTANCE instance_2 WITH CONFIG {"bolt_server": "localhost:7688", "mana REGISTER INSTANCE instance_3 WITH CONFIG {"bolt_server": "localhost:7689", "management_server": "localhost:13013", "replication_server": "localhost:10003"}; ``` -4. Set instance_3 as MAIN: +4. Set instance_3 as main: ```plaintext -SET INSTANCE instance_3 TO MAIN; +SET INSTANCE instance_3 TO main; ``` 5. Connect to the leader coordinator and check cluster state with `SHOW INSTANCES`; @@ -917,8 +922,8 @@ SET INSTANCE instance_3 TO MAIN; ### Check automatic failover -Let's say that the current MAIN instance is down for some reason. After `--instance-down-timeout-sec` seconds, the coordinator will realize -that and automatically promote the first alive REPLICA to become the new MAIN. The output of running `SHOW INSTANCES` on the leader coordinator could then look like: +Let's say that the current main instance is down for some reason. After `--instance-down-timeout-sec` seconds, the coordinator will realize +that and automatically promote the first alive replica to become the new main. The output of running `SHOW INSTANCES` on the leader coordinator could then look like: | name | bolt_server | coordinator_server | management_server | health | role | last_succ_resp_ms | | ------------- | -------------- | ------------------ | ----------------- | ------ | -------- | ------------------| diff --git a/pages/clustering/replication.mdx b/pages/clustering/replication.mdx index efd1817bf..9ce205d61 100644 --- a/pages/clustering/replication.mdx +++ b/pages/clustering/replication.mdx @@ -71,20 +71,29 @@ cluster. Once demoted to REPLICA instances, they will no longer accept write queries. 
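The demotion itself is done with the replication role query; a minimal sketch, assuming the default replication port `10000` referenced later on this page:

```plaintext
SET REPLICATION ROLE TO REPLICA WITH PORT 10000;
```

An instance can later be switched back to the writable role with `SET REPLICATION ROLE TO MAIN;`.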
In order to start the replication, each REPLICA instance needs to be registered -from the MAIN instance by setting a replication mode (SYNC or ASYNC) and +from the MAIN instance by setting a replication mode (SYNC, ASYNC or STRICT_SYNC) and specifying the REPLICA instance's socket address. The replication mode defines the terms by which the MAIN instance can commit the changes to the database, thus modifying the system to prioritize either consistency or availability: -- **SYNC** - After committing a transaction, the MAIN instance will communicate -the changes to all REPLICA instances running in SYNC mode and wait until it -receives a response or information that a timeout is reached. SYNC mode ensures + +- **STRICT_SYNC** - After committing a transaction, the MAIN instance will communicate +the changes to all REPLICA instances and wait until it +receives a response or information that a timeout is reached. The STRICT_SYNC mode ensures consistency and partition tolerance (CP), but not availability for writes. If the primary database has multiple replicas, the system is highly available for reads. But, when a replica fails, the MAIN instance can't process the write due -to the nature of synchronous replication. +to the nature of synchronous replication. It is implemented as two-phase commit protocol. + + +- **SYNC** - After committing a transaction, the MAIN instance will communicate +the changes to all REPLICA instances and wait until it +receives a response or information that a timeout is reached. It is different from +**STRICT_SYNC** mode because it the MAIN can continue committing even in situations +when **SYNC** replica is down. + - **ASYNC** - The MAIN instance will commit a transaction without receiving confirmation from REPLICA instances that they have received the same @@ -234,6 +243,14 @@ the following query: REGISTER REPLICA name ASYNC TO ; ``` + +If you want to register a REPLICA instance with an STRICT_SYNC replication mode, run +the following query: + +```plaintext +REGISTER REPLICA name STRICT_SYNC TO ; +``` + The socket address must be a string value as follows: ```plaintext @@ -282,8 +299,7 @@ If you set REPLICA roles using port `10000`, you can define the socket address s When a REPLICA instance is registered, it will start replication in ASYNC mode until it synchronizes to the current state of the database. Upon -synchronization, REPLICA instances will either continue working in the ASYNC -mode or reset to SYNC mode. +synchronization, REPLICA instances will either continue working in the ASYNC, STRICT_SYNC or SYNC mode. ### Listing all registered REPLICA instances @@ -493,10 +509,12 @@ accepts read and write queries to the database and REPLICA instances accept only read queries. The changes or state of the MAIN instance are replicated to the REPLICA -instances in a SYNC or ASYNC mode. The SYNC mode ensures consistency and +instances in a SYNC, STRICT_SYNC or ASYNC mode. The STRICT_SYNC mode ensures consistency and partition tolerance (CP), but not availability for writes. The ASYNC mode ensures system availability and partition tolerance (AP), while data can only be -eventually consistent. +eventually consistent. The SYNC mode is something in between because it waits +for writes to be accepted on replicas but MAIN can still commit even in situations +when one of REPLICAs is down. By using the timestamp, the MAIN instance knows the current state of the REPLICA. 
If the REPLICA is not synchronized with the MAIN instance, the MAIN @@ -552,6 +570,17 @@ SYNC REPLICA doesn't answer within the expected timeout. ![](/pages/clustering/replication/workflow_diagram_data_manipulation.drawio.png) + +#### STRICT_SYNC replication mode + +The STRICT_SYNC replication mode behaves very similarly to a +SYNC mode except that MAIN won't commit a transaction locally in a situation in +which one of STRICT_SYNC replicas is down. To achieve that, all instances run +together a two-commit protocol which allows you such a synchronization. This +reduces the throughout but such a mode is super useful in a high-availability +scenario in which a failover is the most operation to support. Such a mode then +allows you a failover without the fear of experiencing a data loss. + #### ASYN replication mode In the ASYNC replication mode, the MAIN instance will commit a transaction @@ -571,6 +600,7 @@ instance. ASYNC mode ensures system availability and partition tolerance. + ### Synchronizing instances By comparing timestamps, the MAIN instance knows when a REPLICA instance is not From ecfb2dc9f4a1de055ea92f7d68699a54ca8609d4 Mon Sep 17 00:00:00 2001 From: David Ivekovic Date: Thu, 21 Aug 2025 09:31:56 +0200 Subject: [PATCH 04/13] update text index docs (#1368) Co-authored-by: Matea Pesic <80577904+matea16@users.noreply.github.com> --- pages/querying/text-search.mdx | 58 ++++++++++++++++++++-------------- 1 file changed, 35 insertions(+), 23 deletions(-) diff --git a/pages/querying/text-search.mdx b/pages/querying/text-search.mdx index 386af74fc..e3c3f14e1 100644 --- a/pages/querying/text-search.mdx +++ b/pages/querying/text-search.mdx @@ -13,11 +13,6 @@ Text search is an [experimental feature](/database-management/experimental-features) introduced in Memgraph 2.15.1. To use it, start Memgraph with the `--experimental-enabled=text-search` flag. - -Make sure to start a fresh instance and then load your data. Snapshots and WALs -created with version `vX.Y.Z` without the experimental flag are currently -incompatible with the same version `vX.Y.Z` when the experimental flag is -enabled. Text search allows you to look up nodes with properties that contain specific content. @@ -33,18 +28,42 @@ Text indices and search are powered by the Text indices are created with the `CREATE TEXT INDEX` command. You need to give a name to the new index and specify which labels it should apply to. +### Index all properties + This statement creates a text index named `complianceDocuments` for nodes with -the `Report` label: +the `Report` label, indexing all text-indexable properties: ```cypher CREATE TEXT INDEX complianceDocuments ON :Report; ``` +### Index specific properties + +You can also create a text index on a subset of properties by specifying them explicitly: + +```cypher +CREATE TEXT INDEX index_name ON :Label(prop1, prop2, prop3); +``` + +For example, to create an index only on the `title` and `content` properties of `Report` nodes: + +```cypher +CREATE TEXT INDEX complianceDocuments ON :Report(title, content); +``` + If you attempt to create an index with an existing name, the statement will fail. ### What is indexed -For any given node, if a text index applies to it, all its properties with text-indexable types (`String`, `Integer`, `Float`, or `Boolean`) are stored. +For any given node, if a text index applies to it: +- When no specific properties are listed, all properties with text-indexable types (`String`, `Integer`, `Float`, or `Boolean`) are stored. 
+- When specific properties are listed, only those properties (if they have text-indexable types) are stored. + + + +Changes made within the same transaction are not visible to the index. To see your changes in text search results, you need to commit the transaction first. + + ## Show text indices @@ -199,20 +218,13 @@ fail. ## Compatibility -Being an experimental feature, text search only supports some usage modalities -that are available in Memgraph. Refer to the table below for an overview: - -| Feature | Support | -|-------------------------|-----------| -| Multitenancy | yes | -| Durability | yes | -| Storage modes | yes (all) | -| Replication | no | -| Concurrent transactions | no | - - - -Disclaimer: For now, text search is not guaranteed to work correctly in use -cases that involve concurrent transactions and replication. +Even though text search is an experimental feature, it supports most usage modalities +that are available in Memgraph from version 3.5. Refer to the table below for an overview: - \ No newline at end of file +| Feature | Support | +|-------------------------|--------------------------------------------- | +| Multitenancy | βœ… Yes | +| Durability | βœ… Yes | +| Storage modes | ❌ No (doesn't work in IN_MEMORY_ANALYTICAL) | +| Replication | βœ… Yes (from version 3.5) | +| Concurrent transactions | βœ… Yes (from version 3.5) | \ No newline at end of file From 528f5bde95529579ff441fd1c1f9f95b882eabb1 Mon Sep 17 00:00:00 2001 From: Andi Skrgat Date: Thu, 21 Aug 2025 10:55:01 +0200 Subject: [PATCH 05/13] Start MG charts without root access (#1360) * Support running charts without root access * Update docs table --------- Co-authored-by: Matea Pesic <80577904+matea16@users.noreply.github.com> --- .../install-memgraph/kubernetes.mdx | 204 ++++++++++++------ 1 file changed, 137 insertions(+), 67 deletions(-) diff --git a/pages/getting-started/install-memgraph/kubernetes.mdx b/pages/getting-started/install-memgraph/kubernetes.mdx index 815171f10..ff9d5ef8a 100644 --- a/pages/getting-started/install-memgraph/kubernetes.mdx +++ b/pages/getting-started/install-memgraph/kubernetes.mdx @@ -165,6 +165,35 @@ want to use. Using the latest tag can lead to issues, as a pod restart may pull a newer image, potentially causing unexpected changes or incompatibilities. +### Install Memgraph standalone chart with `minikube` + +If you are installing Memgraph standalone chart locally with `minikube`, we are strongly recommending to enable `csi-hostpath-driver` and use its storage class. Otherwise, +you could have problems with attaching PVCs to pods. + +1. Enable `csi-hostpath-driver` +``` +minikube addons disable storage-provisioner +minikube addons disable default-storageclass +minikube addons enable volumesnapshots +minikube addons enable csi-hostpath-driver +``` + +2. Create a storage class with `csi-hostpath-driver` as a provider (file sc.yaml) + +``` +apiVersion: storage.k8s.io/v1 +kind: StorageClass +metadata: + name: csi-hostpath-delayed +provisioner: hostpath.csi.k8s.io +volumeBindingMode: WaitForFirstConsumer +reclaimPolicy: Delete +``` + +3. `kubectl apply -f sc.yaml` + +4. Set `storageClassName` to `csi-hostpath-delayed` in `values.yaml` + #### Access Memgraph Once Memgraph is installed, you can access it using the provided services and @@ -177,71 +206,81 @@ Lab](/data-visualization). The following table lists the configurable parameters of the Memgraph chart and their default values. 
-| Parameter | Description | Default | -| ------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------- | -| `image.repository` | Memgraph Docker image repository | `memgraph/memgraph` | -| `image.tag` | Specific tag for the Memgraph Docker image. Overrides the image tag whose default is chart version. | `""` (Defaults to chart's app version) | -| `image.pullPolicy` | Image pull policy | `IfNotPresent` | -| `useImagePullSecrets` | Override the default imagePullSecrets | `false` | -| `imagePullSecrets` | Specify image pull secrets | `- name: regcred` | -| `replicaCount` | Number of Memgraph instances to run. Note: no replication or HA support. | `1` | -| `affinity.nodeKey` | Key for node affinity (Preferred) | `""` | -| `affinity.nodeValue` | Value for node affinity (Preferred) | `""` | -| `nodeSelector` | Constrain which nodes your Memgraph pod is eligible to be scheduled on, based on the labels on the nodes. Left empty by default. | `{}` | -| `service.type` | Kubernetes service type | `ClusterIP` | -| `service.enableBolt` | Enable Bolt protocol | `true` | -| `service.boltPort` | Bolt protocol port | `7687` | -| `service.enableWebsocketMonitoring` | Enable WebSocket monitoring | `false` | -| `service.websocketPortMonitoring` | WebSocket monitoring port | `7444` | -| `service.enableHttpMonitoring` | Enable HTTP monitoring | `false` | -| `service.httpPortMonitoring` | HTTP monitoring port | `9091` | -| `service.annotations` | Annotations to add to the service | `{}` | -| `service.labels` | Labels to add to the service | `{}` | -| `persistentVolumeClaim.createStorageClaim` | Enable creation of a Persistent Volume Claim for storage | `true` | -| `persistentVolumeClaim.storageClassName` | Storage class name for the persistent volume claim | `""` | -| `persistentVolumeClaim.storageSize` | Size of the persistent volume claim for storage | `10Gi` | -| `persistentVolumeClaim.existingClaim` | Use an existing Persistent Volume Claim | `memgraph-0` | -| `persistentVolumeClaim.storageVolumeName` | Name of an existing Volume to create a PVC for | `""` | -| `persistentVolumeClaim.createLogStorage` | Enable creation of a Persistent Volume Claim for logs | `true` | -| `persistentVolumeClaim.logStorageClassName` | Storage class name for the persistent volume claim for logs | `""` | -| `persistentVolumeClaim.logStorageSize` | Size of the persistent volume claim for logs | `1Gi` | -| `memgraphConfig` | List of strings defining Memgraph configuration settings | `["--also-log-to-stderr=true"]` | -| `secrets.enabled` | Enable the use of Kubernetes secrets for Memgraph credentials | `false` | -| `secrets.name` | The name of the Kubernetes secret containing Memgraph credentials | `memgraph-secrets` | -| `secrets.userKey` | The key in the Kubernetes secret for the Memgraph user, the value is passed to the `MEMGRAPH_USER` env | `USER` | -| `secrets.passwordKey` | The key in the Kubernetes secret for the Memgraph password, the value is passed to the `MEMGRAPH_PASSWORD` | `PASSWORD` | -| `memgraphEnterpriseLicense` | Memgraph Enterprise License | `""` | -| `memgraphOrganizationName` | Organization name for Memgraph Enterprise License | `""` | -| `statefulSetAnnotations` | Annotations to add to the stateful set | `{}` | -| `podAnnotations` | Annotations to add to the pod | `{}` | -| `resources` | CPU/Memory resource requests/limits. Left empty by default. 
| `{}` | -| `tolerations` | A toleration is applied to a pod and allows the pod to be scheduled on nodes with matching taints. Left empty by default. | `[]` | -| `serviceAccount.create` | Specifies whether a service account should be created | `true` | -| `serviceAccount.annotations` | Annotations to add to the service account | `{}` | -| `serviceAccount.name` | The name of the service account to use. If not set and create is true, a name is generated. | `""` | -| `container.terminationGracePeriodSeconds` | Grace period for pod termination | `1800` | -| `container.livenessProbe.tcpSocket.port` | Port used for TCP connection. Should be the same as bolt port. | `7687` | -| `container.livenessProbe.failureThreshold` | Failure threshold for liveness probe | `20` | -| `container.livenessProbe.timeoutSeconds` | Initial delay for readiness probe | `10` | -| `container.livenessProbe.periodSeconds` | Period seconds for readiness probe | `5` | -| `container.readinessProbe.tcpSocket.port` | Port used for TCP connection. Should be the same as bolt port. | `7687` | -| `container.readinessProbe.failureThreshold` | Failure threshold for readiness probe | `20` | -| `container.readinessProbe.timeoutSeconds` | Initial delay for readiness probe | `10` | -| `container.readinessProbe.periodSeconds` | Period seconds for readiness probe | `5` | -| `container.startupProbe.tcpSocket.port` | Port used for TCP connection. Should be the same as bolt port. | `7687` | -| `container.startupProbe.failureThreshold` | Failure threshold for startup probe | `1440` | -| `container.startupProbe.periodSeconds` | Period seconds for startup probe | `10` | -| `nodeSelectors` | Node selectors for pod. Left empty by default. | `{}` | -| `customQueryModules` | List of custom Query modules that should be mounted to Memgraph Pod | `[]` | -| `sysctlInitContainer.enabled` | Enable the init container to set sysctl parameters | `true` | -| `sysctlInitContainer.maxMapCount` | Value for `vm.max_map_count` to be set by the init container | `262144` | -| `storageClass.create` | If set to true, new StorageClass will be created. | `false` | -| `storageClass.name` | Name of the StorageClass | `"memgraph-generic-storage-class"` | -| `storageClass.provisioner` | Provisioner for the StorageClass | `""` | -| `storageClass.storageType` | Type of storage for the StorageClass | `""` | -| `storageClass.fsType` | Filesystem type for the StorageClass | `""` | -| `storageClass.reclaimPolicy` | Reclaim policy for the StorageClass | `Retain` | -| `storageClass.volumeBindingMode` | Volume binding mode for the StorageClass | `Immediate` | +| Parameter | Description | Default | +| --------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------- | +| `image.repository` | Memgraph Docker image repository | `memgraph/memgraph` | +| `image.tag` | Specific tag for the Memgraph Docker image. Overrides the image tag whose default is chart version. 
| `""` (Defaults to chart's app version) | +| `image.pullPolicy` | Image pull policy | `IfNotPresent` | +| `memgraphUserId` | The user id that is hardcoded in Memgraph and Mage images | `101` | +| `memgraphGroupId` | The group id that is hardcoded in Memgraph and Mage images | `103` | +| `useImagePullSecrets` | Override the default imagePullSecrets | `false` | +| `imagePullSecrets` | Specify image pull secrets | `- name: regcred` | +| `replicaCount` | Number of Memgraph instances to run. Note: no replication or HA support. | `1` | +| `affinity.nodeKey` | Key for node affinity (Preferred) | `""` | +| `affinity.nodeValue` | Value for node affinity (Preferred) | `""` | +| `nodeSelector` | Constrain which nodes your Memgraph pod is eligible to be scheduled on, based on the labels on the nodes. Left empty by default. | `{}` | +| `service.type` | Kubernetes service type | `ClusterIP` | +| `service.enableBolt` | Enable Bolt protocol | `true` | +| `service.boltPort` | Bolt protocol port | `7687` | +| `service.enableWebsocketMonitoring` | Enable WebSocket monitoring | `false` | +| `service.websocketPortMonitoring` | WebSocket monitoring port | `7444` | +| `service.enableHttpMonitoring` | Enable HTTP monitoring | `false` | +| `service.httpPortMonitoring` | HTTP monitoring port | `9091` | +| `service.annotations` | Annotations to add to the service | `{}` | +| `service.labels` | Labels to add to the service | `{}` | +| `persistentVolumeClaim.createStorageClaim` | Enable creation of a Persistent Volume Claim for storage | `true` | +| `persistentVolumeClaim.storageClassName` | Storage class name for the persistent volume claim | `""` | +| `persistentVolumeClaim.storageSize` | Size of the persistent volume claim for storage | `10Gi` | +| `persistentVolumeClaim.existingClaim` | Use an existing Persistent Volume Claim | `memgraph-0` | +| `persistentVolumeClaim.storageVolumeName` | Name of an existing Volume to create a PVC for | `""` | +| `persistentVolumeClaim.createLogStorage` | Enable creation of a Persistent Volume Claim for logs | `true` | +| `persistentVolumeClaim.logStorageClassName` | Storage class name for the persistent volume claim for logs | `""` | +| `persistentVolumeClaim.logStorageSize` | Size of the persistent volume claim for logs | `1Gi` | +| `persistentVolumeClaim.createUserClaim` | Create a Dynamic Persistant Volume Claim for Configs, Certificates (e.g. Bolt cert ) and rest of User related files | `false` | +| `persistentVolumeClaim.userStorageClassName` | Storage class name for the persistent volume claim for user storage | `""` | +| `persistentVolumeClaim.userStorageSize` | Size of the persistent volume claim for user storage | `1Gi` | +| `persistentVolumeClaim.userStorageAccessMode` | Storage Class Access Mode. If you need a different pod to add data into Memgraph (e.g. 
CSV files) set this to "ReadWriteMany" | `ReadWriteOnce` | +| `persistentVolumeClaim.userMountPath` | Where to mount the `userStorageClass` you should set this variable if you are enabling the `UserClaim` | `""` | +| `memgraphConfig` | List of strings defining Memgraph configuration settings | `["--also-log-to-stderr=true"]` | +| `secrets.enabled` | Enable the use of Kubernetes secrets for Memgraph credentials | `false` | +| `secrets.name` | The name of the Kubernetes secret containing Memgraph credentials | `memgraph-secrets` | +| `secrets.userKey` | The key in the Kubernetes secret for the Memgraph user, the value is passed to the `MEMGRAPH_USER` env | `USER` | +| `secrets.passwordKey` | The key in the Kubernetes secret for the Memgraph password, the value is passed to the `MEMGRAPH_PASSWORD` | `PASSWORD` | +| `memgraphEnterpriseLicense` | Memgraph Enterprise License | `""` | +| `memgraphOrganizationName` | Organization name for Memgraph Enterprise License | `""` | +| `statefulSetAnnotations` | Annotations to add to the stateful set | `{}` | +| `podAnnotations` | Annotations to add to the pod | `{}` | +| `resources` | CPU/Memory resource requests/limits. Left empty by default. | `{}` | +| `tolerations` | A toleration is applied to a pod and allows the pod to be scheduled on nodes with matching taints. Left empty by default. | `[]` | +| `serviceAccount.create` | Specifies whether a service account should be created | `true` | +| `serviceAccount.annotations` | Annotations to add to the service account | `{}` | +| `serviceAccount.name` | The name of the service account to use. If not set and create is true, a name is generated. | `""` | +| `container.terminationGracePeriodSeconds` | Grace period for pod termination | `1800` | +| `container.livenessProbe.tcpSocket.port` | Port used for TCP connection. Should be the same as bolt port. | `7687` | +| `container.livenessProbe.failureThreshold` | Failure threshold for liveness probe | `20` | +| `container.livenessProbe.timeoutSeconds` | Initial delay for readiness probe | `10` | +| `container.livenessProbe.periodSeconds` | Period seconds for readiness probe | `5` | +| `container.readinessProbe.tcpSocket.port` | Port used for TCP connection. Should be the same as bolt port. | `7687` | +| `container.readinessProbe.failureThreshold` | Failure threshold for readiness probe | `20` | +| `container.readinessProbe.timeoutSeconds` | Initial delay for readiness probe | `10` | +| `container.readinessProbe.periodSeconds` | Period seconds for readiness probe | `5` | +| `container.startupProbe.tcpSocket.port` | Port used for TCP connection. Should be the same as bolt port. | `7687` | +| `container.startupProbe.failureThreshold` | Failure threshold for startup probe | `1440` | +| `container.startupProbe.periodSeconds` | Period seconds for startup probe | `10` | +| `nodeSelectors` | Node selectors for pod. Left empty by default. | `{}` | +| `customQueryModules` | List of custom Query modules that should be mounted to Memgraph Pod | `[]` | +| `storageClass.create` | If set to true, new StorageClass will be created. 
| `false` | +| `storageClass.name` | Name of the StorageClass | `"memgraph-generic-storage-class"` | +| `storageClass.provisioner` | Provisioner for the StorageClass | `""` | +| `storageClass.storageType` | Type of storage for the StorageClass | `""` | +| `storageClass.fsType` | Filesystem type for the StorageClass | `""` | +| `storageClass.reclaimPolicy` | Reclaim policy for the StorageClass | `Retain` | +| `storageClass.volumeBindingMode` | Volume binding mode for the StorageClass | `Immediate` | +| `sysctlInitContainer.enabled` | Enable the init container to set sysctl parameters | `true` | +| `sysctlInitContainer.maxMapCount` | Value for `vm.max_map_count` to be set by the init container | `262144` | +| `sysctlInitContainer.image.repository` | Busybox image repository | `library/busybox` | +| `sysctlInitContainer.image.tag` | Specific tag for the Busybox Docker image | `latest` | +| `sysctlInitContainer.image.pullPolicy` | Image pull policy for busybox | `IfNotPresent` | To change the default chart values, provide your own `values.yaml` file during the installation: @@ -329,6 +368,36 @@ want to use. Using the latest tag can lead to issues, as a pod restart may pull a newer image, potentially causing unexpected changes or incompatibilities. +### Install Memgraph HA chart with `minikube` + +If you are installing Memgraph HA chart locally with `minikube`, we are strongly recommending to enable `csi-hostpath-driver` and use its storage class. Otherwise, +you could have problems with attaching PVCs to pods. + +1. Enable `csi-hostpath-driver` +``` +minikube addons disable storage-provisioner +minikube addons disable default-storageclass +minikube addons enable volumesnapshots +minikube addons enable csi-hostpath-driver +``` + +2. Create a storage class with `csi-hostpath-driver` as a provider (file sc.yaml) + +``` +apiVersion: storage.k8s.io/v1 +kind: StorageClass +metadata: + name: csi-hostpath-delayed +provisioner: hostpath.csi.k8s.io +volumeBindingMode: WaitForFirstConsumer +reclaimPolicy: Delete +``` + +3. `kubectl apply -f sc.yaml` + +4. Set `libStorageClassName` to `csi-hostpath-delayed` in `values.yaml` + + ### Changing the default chart values To change the default chart values, run the command with the specified set of @@ -369,8 +438,7 @@ Uninstalling the chart won't trigger deletion of persistent volume claims (PVCs) ### Security context -All instances are started as `StatefulSet` with one pod. The pod has two or three containers depending on whether the sysctlInitContainer.enabled is used. The **init** container -is used to set permissions on volume mounts. It is used as root user with `CHOWN` capability and without privileged access. The **memgraph-coordinator** container is the one which +All instances are started as `StatefulSet` with one pod. The pod has two or three containers depending on whether the sysctlInitContainer.enabled is used. The **memgraph-coordinator** container is the one which actually runs Memgraph image. The process is run by non-root **memgraph** user without any Linux capabilities. Privileges cannot escalate. 
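For reference, the non-root setup described above corresponds roughly to the pod `securityContext` sketched below. This is illustrative only: the values assume the chart defaults `memgraphUserId: 101` and `memgraphGroupId: 103`, and the manifests actually rendered by the chart may differ.

```
apiVersion: v1
kind: Pod
metadata:
  name: memgraph-coordinator-example   # hypothetical pod name
spec:
  securityContext:
    runAsNonRoot: true    # process runs as the non-root memgraph user
    runAsUser: 101        # memgraphUserId
    runAsGroup: 103       # memgraphGroupId
    fsGroup: 103          # volumes writable by the memgraph group
  containers:
    - name: memgraph-coordinator
      image: memgraph/memgraph
      securityContext:
        allowPrivilegeEscalation: false   # privileges cannot escalate
        capabilities:
          drop: ["ALL"]                   # no Linux capabilities
```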
### High availability storage

@@ -637,6 +705,8 @@ The following table lists the configurable parameters of the Memgraph HA chart a
 | `image.pullPolicy` | Image pull policy | `IfNotPresent` |
 | `env.MEMGRAPH_ENTERPRISE_LICENSE` | Memgraph enterprise license | `` |
 | `env.MEMGRAPH_ORGANIZATION_NAME` | Organization name | `` |
+| `memgraphUserId` | The user id that is hardcoded in the Memgraph and MAGE images | `101` |
+| `memgraphGroupId` | The group id that is hardcoded in the Memgraph and MAGE images | `103` |
 | `storage.libPVCSize` | Size of the storage PVC | `1Gi` |
 | `storage.libStorageClassName` | The name of the storage class used for storing data. | `""` |
 | `storage.libStorageAccessMode` | Access mode used for lib storage. | `ReadWriteOnce` |

From ea9158c5297b88c98b42e08b66e18cbf3bafed9f Mon Sep 17 00:00:00 2001
From: Andi Skrgat
Date: Thu, 21 Aug 2025 11:14:52 +0200
Subject: [PATCH 06/13] Add replication lag (#1357)

Co-authored-by: Matea Pesic <80577904+matea16@users.noreply.github.com>
---
 pages/clustering/high-availability.mdx | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/pages/clustering/high-availability.mdx b/pages/clustering/high-availability.mdx
index 7cfb9da23..fa0d39f1f 100644
--- a/pages/clustering/high-availability.mdx
+++ b/pages/clustering/high-availability.mdx
@@ -295,7 +295,6 @@ e.g during a maintenance work on the instance where the current main is
 deployed
 
 
-
 ### Unregister instance
 
 There are various reasons which could lead to the decision that an instance needs to be removed from the cluster. The hardware can be broken,
@@ -366,6 +365,16 @@ This query will return the information about:
 If the query `ADD COORDINATOR` wasn't run for the current instance, the value of the bolt server will be "".
 
+### Show replication lag
+
+The user can find the current replication lag on each instance by running `SHOW REPLICATION LAG` on the cluster's leader. The replication lag is expressed as
+the number of committed transactions. This information is made durable through snapshots and WALs, so restarts won't cause information loss. The replication
+lag is useful to check when manually performing a failover, to see whether there is a risk of data loss.
+
+```plaintext
+SHOW REPLICATION LAG;
+```
+
 ## Setting config for highly-available cluster

From 255c2be6fed55defc4fee13001b3c62a7aba4dff Mon Sep 17 00:00:00 2001
From: Andi Skrgat
Date: Thu, 21 Aug 2025 12:18:10 +0200
Subject: [PATCH 07/13] Add custom init containers (#1362)

Co-authored-by: Matea Pesic <80577904+matea16@users.noreply.github.com>
---
 pages/getting-started/install-memgraph/kubernetes.mdx | 1 +
 1 file changed, 1 insertion(+)

diff --git a/pages/getting-started/install-memgraph/kubernetes.mdx b/pages/getting-started/install-memgraph/kubernetes.mdx
index ff9d5ef8a..2d82b5ff3 100644
--- a/pages/getting-started/install-memgraph/kubernetes.mdx
+++ b/pages/getting-started/install-memgraph/kubernetes.mdx
@@ -276,6 +276,7 @@ their default values. 
| `storageClass.fsType` | Filesystem type for the StorageClass | `""` | | `storageClass.reclaimPolicy` | Reclaim policy for the StorageClass | `Retain` | | `storageClass.volumeBindingMode` | Volume binding mode for the StorageClass | `Immediate` | +| `initContainers` | User specific init containers | `[]` | | `sysctlInitContainer.enabled` | Enable the init container to set sysctl parameters | `true` | | `sysctlInitContainer.maxMapCount` | Value for `vm.max_map_count` to be set by the init container | `262144` | | `sysctlInitContainer.image.repository` | Busybox image repository | `library/busybox` | From dbde27432a269f49db89c5105d7cd800e0a84d0d Mon Sep 17 00:00:00 2001 From: Andi Skrgat Date: Thu, 21 Aug 2025 12:28:57 +0200 Subject: [PATCH 08/13] Add support for specifying update strategy on HA chart (#1363) Co-authored-by: Matea Pesic <80577904+matea16@users.noreply.github.com> --- pages/getting-started/install-memgraph/kubernetes.mdx | 1 + 1 file changed, 1 insertion(+) diff --git a/pages/getting-started/install-memgraph/kubernetes.mdx b/pages/getting-started/install-memgraph/kubernetes.mdx index 2d82b5ff3..882b9d1d8 100644 --- a/pages/getting-started/install-memgraph/kubernetes.mdx +++ b/pages/getting-started/install-memgraph/kubernetes.mdx @@ -771,6 +771,7 @@ The following table lists the configurable parameters of the Memgraph HA chart a | `prometheus.memgraphExporter.tag` | The tag of Memgraph's Prometheus exporter image. | `0.2.1` | | `prometheus.serviceMonitor.kubePrometheusStackReleaseName` | The release name under which `kube-prometheus-stack` chart is installed. | `kube-prometheus-stack` | | `prometheus.serviceMonitor.interval` | How often will Prometheus pull data from Memgraph's Prometheus exporter. | `15s` | +| `updateStrategy.type` | Update strategy for StatefulSets. Possible values are `RollingUpdate` and `OnDelete` | `RollingUpdate` | For the `data` and `coordinators` sections, each item in the list has the following parameters: From d9b8dd7862a112906cfff4883bcdb07309b0bda5 Mon Sep 17 00:00:00 2001 From: Andi Skrgat Date: Thu, 21 Aug 2025 12:45:40 +0200 Subject: [PATCH 09/13] Add labels support (#1365) Co-authored-by: Matea Pesic <80577904+matea16@users.noreply.github.com> --- pages/getting-started/install-memgraph/kubernetes.mdx | 3 +++ 1 file changed, 3 insertions(+) diff --git a/pages/getting-started/install-memgraph/kubernetes.mdx b/pages/getting-started/install-memgraph/kubernetes.mdx index 882b9d1d8..688d2b95c 100644 --- a/pages/getting-started/install-memgraph/kubernetes.mdx +++ b/pages/getting-started/install-memgraph/kubernetes.mdx @@ -771,6 +771,9 @@ The following table lists the configurable parameters of the Memgraph HA chart a | `prometheus.memgraphExporter.tag` | The tag of Memgraph's Prometheus exporter image. | `0.2.1` | | `prometheus.serviceMonitor.kubePrometheusStackReleaseName` | The release name under which `kube-prometheus-stack` chart is installed. | `kube-prometheus-stack` | | `prometheus.serviceMonitor.interval` | How often will Prometheus pull data from Memgraph's Prometheus exporter. | `15s` | +| `labels.coordinators.podLabels` | Enables you to set labels on a pod level. | `{}` | +| `labels.coordinators.statefulSetLabels` | Enables you to set labels on a stateful set level. | `{}` | +| `labels.coordinators.serviceLabels` | Enables you to set labels on a service level. | `{}` | | `updateStrategy.type` | Update strategy for StatefulSets. 
Possible values are `RollingUpdate` and `OnDelete` | `RollingUpdate` | From 1149ed958a15c40ac2d0419cabe3c63055c8ff1d Mon Sep 17 00:00:00 2001 From: Gareth Andrew Lloyd Date: Thu, 21 Aug 2025 12:39:00 +0100 Subject: [PATCH 10/13] Auto-index and TTL (#1371) * Auto-index and TTL * Replication fixes * Update pages/querying/time-to-live.mdx --------- Co-authored-by: Matea Pesic <80577904+matea16@users.noreply.github.com> --- pages/fundamentals/indexes.mdx | 10 ++++++---- pages/querying/time-to-live.mdx | 28 ++++++++++++++++++++++++---- 2 files changed, 30 insertions(+), 8 deletions(-) diff --git a/pages/fundamentals/indexes.mdx b/pages/fundamentals/indexes.mdx index 9f60dc098..0bb9e964e 100644 --- a/pages/fundamentals/indexes.mdx +++ b/pages/fundamentals/indexes.mdx @@ -701,7 +701,11 @@ and [`storage-automatic-edge-type-index-creation-enabled`](/configuration/configuration-settings#storage) flags, it is possible to create label and edge-type indices automatically. Every time the database encounters a label or edge-type that is currently not indexed, -it will create an index for that construct. +it will queue an index creation request that runs asynchronously in the background. + +Auto-index creation operates with the following characteristics: +- **Asynchronous execution**: Index creation runs in dedicated background transaction +- **Concurrent creation**: Utilizes the same non-blocking mechanism as manual index creation, allowing user queries to continue uninterrupted ## Recovery @@ -894,9 +898,7 @@ extends how long write queries will retry before failing. The system maintains **full MVCC consistency** throughout the process, ensuring transactional integrity. Long-running index operations can be safely cancelled -if needed. Currently, some features like replication synchronization and TTL -indices still use blocking mode during operations, though these limitations will -be addressed in future releases. +if needed. For complete technical details about the implementation, consistency guarantees, and current limitations, please refer to the [Concurrent Index Creation diff --git a/pages/querying/time-to-live.mdx b/pages/querying/time-to-live.mdx index 829767493..e14ed7e1e 100644 --- a/pages/querying/time-to-live.mdx +++ b/pages/querying/time-to-live.mdx @@ -11,6 +11,12 @@ Time-to-live allows a user to tag vertices and edges with an expiration time. On +**Breaking change in v3.5.0**: TTL durability from versions before v3.5.0 is not compatible with the new implementation. If you are upgrading from an earlier version, you will need to reconfigure TTL after the upgrade. + + + + + The `TTL` label and `ttl` property are reserved names for TTL. See [Tagging objects](#tagging-objects) for more info. @@ -32,9 +38,15 @@ Once that is done, a background job will periodically delete expired vertices, i ### What is indexed -Time-to-live uses a label `TTL` and property `ttl` to tag vertices. A label+property value index is used to speed up query execution. +Time-to-live uses a label `TTL` and property `ttl` to tag vertices. A label+property value index is created using concurrent index creation to minimize blocking. Edges are tagged using only the `ttl` property and are scanned using the global edge property index. + + +TTL index creation now uses concurrent index creation. Hence, from v3.5.0, TTL query commands are no longer blocked waiting for index to be created. This significantly reduces the impact on database operation. 
+ + + ### Executed query Time-to-live is implemented as a background job that execute the following queries: @@ -187,6 +199,14 @@ Time-to-live configuration is tenant based; meaning that the feature will need t ### Replication -Time-to-live background job will be execute only on MAIN and the changes will be replicated. -While the TTL effect is replicated, the configuration is not. TTL needs to be configured manually on every instance that can become MAIN. -If an instance is a REPLICA, the TTL background job will be paused until the instance becomes MAIN. +Time-to-live is fully integrated with replication: +- The TTL background job executes only on the MAIN instance +- All TTL deletions are automatically replicated to REPLICA instances +- TTL configuration is now properly replicated across instances +- If an instance is a REPLICA, the TTL background job will be paused until the instance becomes MAIN + + + +Starting from v3.5.0, TTL fully supports replication. The TTL operations are performed at the storage level and are automatically synchronized across all replicas. + + From fd315ff9b51440fe77666d5e10e63c3f25b46f34 Mon Sep 17 00:00:00 2001 From: andrejtonev <29177572+andrejtonev@users.noreply.github.com> Date: Thu, 21 Aug 2025 14:10:31 +0200 Subject: [PATCH 11/13] CREATE and SHOW SNAPSHOTS changes (#1370) Co-authored-by: Matea Pesic <80577904+matea16@users.noreply.github.com> --- pages/database-management/backup-and-restore.mdx | 11 ++++++++--- pages/fundamentals/data-durability.mdx | 1 + 2 files changed, 9 insertions(+), 3 deletions(-) diff --git a/pages/database-management/backup-and-restore.mdx b/pages/database-management/backup-and-restore.mdx index 769b0d67f..0d62d29f8 100644 --- a/pages/database-management/backup-and-restore.mdx +++ b/pages/database-management/backup-and-restore.mdx @@ -111,9 +111,14 @@ SHOW SNAPSHOTS; ``` Its results contain the path to the file, the logical timestamp, the physical -timestamp and the file size. If the periodic snapshot background job is active, -the first element in the results will define the time at which the snapshots -will be created. +timestamp and the file size. + +As of Memgraph v3.5, the `SHOW SNAPSHOTS` query does not return information regarding the next scheduled snapshot. +A special query has been added: +``` +SHOW NEXT SNAPSHOT; +``` +If the periodic snapshot background job is active, the result will return the path and the time at which the snapshots will be created. If you are using Memgraph pre v2.22, follow these steps to restore data from a backup: diff --git a/pages/fundamentals/data-durability.mdx b/pages/fundamentals/data-durability.mdx index 933d56e90..ba90845d3 100644 --- a/pages/fundamentals/data-durability.mdx +++ b/pages/fundamentals/data-durability.mdx @@ -119,6 +119,7 @@ If another snapshot is already being created or no committed writes to the datab By default, snapshot files are saved inside the `var/lib/memgraph/snapshots` directory. +The `CREATE SNAPSHOT` query will return the path of the newly created snapshot file. 
To query which snapshots currently exist in the data directory, execute: ```opencypher From 67925bb5c03014a59ef66e084a2603455c6b3d0d Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Ivan=20Milinovi=C4=87?= <44698587+imilinovic@users.noreply.github.com> Date: Thu, 21 Aug 2025 14:25:41 +0200 Subject: [PATCH 12/13] Add docs for SSO changes (#1373) * SSO docs * typos * Apply suggestions from code review --------- Co-authored-by: Matea Pesic <80577904+matea16@users.noreply.github.com> --- pages/client-libraries/python.mdx | 25 ++++++++++++++-- .../auth-system-integrations.mdx | 30 +++++++++++++++++++ 2 files changed, 52 insertions(+), 3 deletions(-) diff --git a/pages/client-libraries/python.mdx b/pages/client-libraries/python.mdx index 75f668d74..251784b68 100644 --- a/pages/client-libraries/python.mdx +++ b/pages/client-libraries/python.mdx @@ -277,13 +277,19 @@ with GraphDatabase.driver(URI, auth=AUTH) as client: This is currently only supported for OIDC SSO. -To use SSO with the Python driver you need to get the access and id tokens yourself. +To use SSO with the Python driver, you need to get the access token and, optionally, the ID token yourself. One simple way to do it is to use the authlib library and follow the official [tutorial](https://docs.authlib.org/en/latest/client/oauth2.html). To connect to the Memgraph database you have to use the `custom_auth` class with the `scheme` parameter set as `oidc-entra-id`, `oidc-okta` or `oidc-custom` depending on which scheme you are using, -`credentials` parameter set to contain both access and id tokens in the format shown in the example below. Finally set `principal` and `realm` parameters to `None`. +`credentials` parameter set to contain the access token and optionally the ID token in the format shown in the example below. Finally, set `principal` and `realm` parameters to `None`. -Below is an example of connecting to the Memgraph database using OIDC SSO with custom auth scheme. + +The ID token is only required if your username configuration uses a field from the ID token (e.g., `id:sub`). If your username is configured to use a field from the access token (e.g., `access:preferred_username`), you can omit the ID token from the credentials string. + + +Below are examples of connecting to the Memgraph database using OIDC SSO with the custom auth scheme. + +**With both access and ID tokens:** ```python with neo4j.GraphDatabase.driver( "bolt://localhost:7687", @@ -296,6 +302,19 @@ with neo4j.GraphDatabase.driver( ) as driver: ``` +**With access token only (when username is configured to use access token fields):** +```python +with neo4j.GraphDatabase.driver( + "bolt://localhost:7687", + auth=neo4j.custom_auth( + scheme="oidc-custom", + credentials=f"access_token={token['access_token']}", + principal=None, + realm=None, + ) +) as driver: +``` + #### Impersonate a user diff --git a/pages/database-management/authentication-and-authorization/auth-system-integrations.mdx b/pages/database-management/authentication-and-authorization/auth-system-integrations.mdx index f5cd5e6c7..64f0f2810 100644 --- a/pages/database-management/authentication-and-authorization/auth-system-integrations.mdx +++ b/pages/database-management/authentication-and-authorization/auth-system-integrations.mdx @@ -320,9 +320,39 @@ One way to deduce the audience of the access and id tokens is to decode them usi Often time access and id token will the use the same audience. For example in MS Entra ID both tokens use the client ID as audience. 
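If you would rather inspect a token locally than paste it into jwt.io, decoding its payload is enough to read the `aud` claim. A minimal sketch, assuming the PyJWT package is installed; signature verification is disabled, so use this only for inspecting claims, never for authenticating anything:

```python
import jwt  # PyJWT

def print_audience(token: str) -> None:
    # Decode the payload without verifying the signature (inspection only).
    claims = jwt.decode(token, options={"verify_signature": False})
    print(claims.get("aud"))
```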
+##### Connect via Neo4j drivers
+
+When connecting through a driver, you can choose to provide only the access token and skip the ID token.
+In general, all connection methods follow the same approach: setting only the **scheme** and **credentials**.
+- **Scheme**: Use the scheme that applies to your setup (`oidc-entra-id`, `oidc-okta`, or `oidc-custom`)
+- **Credentials**: Provide them as a string in this format:
+```access_token=token-data;id_token=token-data```
+If you don't want to include the ID token, simply omit it:
+```access_token=token-data```
+
+
+The OIDC module automatically determines whether to validate the ID token based on your username configuration. If your username is configured to use a field from the ID token (e.g., `id:sub`), the module will require and validate the ID token. If your username uses a field from the access token (e.g., `access:preferred_username`), the ID token validation is skipped.
+
+
+Below is an example of connecting via the Neo4j Python driver.
+
+```python
+from neo4j import GraphDatabase, Auth
+
+driver = GraphDatabase.driver(MEMGRAPH_URI,
+    auth=Auth(
+        scheme="oidc-entra-id",
+        credentials="access_token=token-data;id_token=token-data",
+        realm=None,
+        principal=None
+    )
+)
+```
+
 ##### Username
 
 The username variable tells the OIDC module what to use as the username. It has the format `token-type:field`. Token type can be `id` or `access` depending on whether you want to use a field from the access or the ID token for the username. See the following to learn more about [access](https://www.okta.com/identity-101/access-token/) and [id](https://developer.okta.com/docs/guides/validate-id-tokens/main/#id-tokens-vs-access-tokens) tokens.
+
 By default, it is set to `id:sub` as per the OIDC protocol it is recommended to use the `sub` field from the id token as it is non-mutable and globally unique for each application. For Okta one commonly used field is `access:sub` which is usually the email of the user. You can also configure [custom claims](https://developer.okta.com/docs/guides/customize-tokens-returned-from-okta/main/).
 

From adefd9d18fd8a1a70a7366c62701c8b928a833b4 Mon Sep 17 00:00:00 2001
From: andrejtonev <29177572+andrejtonev@users.noreply.github.com>
Date: Thu, 21 Aug 2025 14:45:05 +0200
Subject: [PATCH 13/13] K shortest paths (#1369)

* touch
* Defined algo
* details
* difference in cypher
* update

---------

Co-authored-by: matea16
Co-authored-by: Matea Pesic <80577904+matea16@users.noreply.github.com>
---
 pages/advanced-algorithms.mdx                 |   8 +-
 .../available-algorithms.mdx                  |   1 +
 .../deep-path-traversal.mdx                   | 104 +++++++++++++++++-
 .../workloads/memgraph-in-cybersecurity.mdx   |   1 +
 pages/getting-started.mdx                     |   2 +-
 pages/querying/best-practices.mdx             |   6 +-
 .../differences-in-cypher-implementations.mdx |  22 +++-
 7 files changed, 134 insertions(+), 10 deletions(-)

diff --git a/pages/advanced-algorithms.mdx b/pages/advanced-algorithms.mdx
index 42374738e..78bc47c58 100644
--- a/pages/advanced-algorithms.mdx
+++ b/pages/advanced-algorithms.mdx
@@ -13,11 +13,11 @@ If you require procedures designed to solve specific graph problems, there is a
 number of advanced algorithms available in Memgraph. 
[BFS](/advanced-algorithms/deep-path-traversal#breadth-first-search), -[DFS](/advanced-algorithms/deep-path-traversal#depth-first-search), -[Weighted shortest -path](/advanced-algorithms/deep-path-traversal#weighted-shortest-path), +[DFS](/advanced-algorithms/deep-path-traversal#depth-first-search), [Weighted +shortest path](/advanced-algorithms/deep-path-traversal#weighted-shortest-path), [All shortest -paths](/advanced-algorithms/deep-path-traversal#all-shortest-paths) are +paths](/advanced-algorithms/deep-path-traversal#all-shortest-paths) and [K +shortest paths](/advanced-algorithms/deep-path-traversal#k-shortest-paths) are built-in deep path traversal algorithms you can run using their specific clauses. diff --git a/pages/advanced-algorithms/available-algorithms.mdx b/pages/advanced-algorithms/available-algorithms.mdx index 1518e04e2..d86e888c6 100644 --- a/pages/advanced-algorithms/available-algorithms.mdx +++ b/pages/advanced-algorithms/available-algorithms.mdx @@ -22,6 +22,7 @@ library](/advanced-algorithms/install-mage). | [Breadth-first search](/advanced-algorithms/deep-path-traversal#breadth-first-search) | C++ | An algorithm for traversing through a graph starting based on nodes' breadth (distance from the source node). | | [Weighted shortest path](/advanced-algorithms/deep-path-traversal#weighted-shortest-path) | C++ | The weighted shortest path problem is the problem of finding a path between two nodes in a graph such that the sum of the weights of relationships connecting nodes, or the sum of the weight of some node property on the path, is minimized. | | [All shortest paths](/advanced-algorithms/deep-path-traversal#all-shortest-paths) | C++ | Finding all shortest paths is an expansion of the weighted shortest paths problem. The goal of finding the shortest path is obtaining any minimum sum of weights on the path from one node to the other. | +| [K shortest paths](/advanced-algorithms/deep-path-traversal#k-shortest-paths) | C++ | Returning K shortest paths between 2 nodes in order of shortest to longest paths. | ## Traditional graph algorithms diff --git a/pages/advanced-algorithms/deep-path-traversal.mdx b/pages/advanced-algorithms/deep-path-traversal.mdx index 27a129713..4702f04bf 100644 --- a/pages/advanced-algorithms/deep-path-traversal.mdx +++ b/pages/advanced-algorithms/deep-path-traversal.mdx @@ -18,6 +18,7 @@ algorithms are built into Memgraph and don't require any additional libraries: * [Breadth-first search (BFS)](#breadth-first-search) * [Weighted shortest path (WSP)](#weighted-shortest-path) * [All shortest paths (ASP)](#all-shortest-paths) + * [K shortest paths (KSP)](#k-shortest-paths) Below you can find examples of how to use these algorithms and you can try them out @@ -27,7 +28,7 @@ Europe backpacking dataset or adjust them to the dataset of your choice. -Memgraph has a lot more graph algorithms to offer besides these four, and they +Memgraph has a lot more graph algorithms to offer besides these five, and they are all a part of [MAGE](/advanced-algorithms/install-mage) - Memgraph Advanced Graph Extensions, an open-source repository that contains graph algorithms and modules written in the form of query modules that can be used to tackle the most @@ -795,4 +796,105 @@ RETURN path, total_weight; current traversal path p, allowing you to filter based on how the current node was reached. + +## K shortest paths + +The K shortest paths algorithm finds K shortest paths between two nodes in order +of increasing length. 
This algorithm is useful when you need to find alternative +routes or analyze path diversity in a graph. + +### Syntax + +The K shortest paths algorithm uses the `*KSHORTEST` syntax: + +```cypher +MATCH (source), (target) +WITH source, target +MATCH path=(source)-[*KSHORTEST]->(target) +RETURN path; +``` + +You can also limit the number of paths returned using the `|` syntax: + +```cypher +MATCH (source), (target) +WITH source, target +MATCH path=(source)-[*KSHORTEST|3]->(target) +RETURN path; +``` + +This will return at most 3 shortest paths between the source and target nodes. + +### Example + + + + + + ```cypher + MATCH (n) DETACH DELETE n; + CREATE (n1:Node {name: "A"}), (n2:Node {name: "B"}), (n3:Node {name: "C"}), (n4:Node {name: "D"}), (n5:Node {name: "E"}), + (n1)-[:ConnectedTo]->(n2), (n1)-[:ConnectedTo]->(n3), (n2)-[:ConnectedTo]->(n3), + (n2)-[:ConnectedTo]->(n4), (n4)-[:ConnectedTo]->(n3), (n3)-[:ConnectedTo]->(n5); + ``` + + + + ```cypher + MATCH (a:Node {name: "A"}), (e:Node {name: "E"}) + WITH a, e + MATCH path=(a)-[*KSHORTEST]->(e) + RETURN path; + ``` + + + + ```plaintext + +---------------------------------------------------------------------------------------------------------------------------------------------------------------------+ + | path | + +---------------------------------------------------------------------------------------------------------------------------------------------------------------------+ + | (:Node {name: "A"})-[:ConnectedTo]->(:Node {name: "C"})-[:ConnectedTo]->(:Node {name: "E"}) | + | (:Node {name: "A"})-[:ConnectedTo]->(:Node {name: "B"})-[:ConnectedTo]->(:Node {name: "C"})-[:ConnectedTo]->(:Node {name: "E"}) | + | (:Node {name: "A"})-[:ConnectedTo]->(:Node {name: "B"})-[:ConnectedTo]->(:Node {name: "D"})-[:ConnectedTo]->(:Node {name: "C"})-[:ConnectedTo]->(:Node {name: "E"}) | + +---------------------------------------------------------------------------------------------------------------------------------------------------------------------+ + ``` + + + + + +### Path length constraints + +You can constrain the path length using the range syntax: + +```cypher +MATCH (a:Node {name: "A"}), (e:Node {name: "E"}) +WITH a, e +MATCH path=(a)-[*KSHORTEST 2..4]->(e) +RETURN path; +``` + +This will only return paths with a length between 2 and 4 hops (inclusive). + +### When to use K shortest paths? 
+ +Use the K shortest paths algorithm when you need to: +- Find alternative routes between two nodes +- Analyze path diversity in a graph +- Identify backup paths or redundant connections +- Understand the network structure beyond just the shortest path + +### Interaction with other systems + +- **Fine-grained access control**: Supports enterprise access control features +- **Hops limit**: Supports query level hops limit + +### Important limitations + +- **Predefined nodes**: Both source and target nodes must be matched first using + a `WITH` clause +- **No filter lambdas**: K shortest paths does not support user-defined + filtering during expansion + + \ No newline at end of file diff --git a/pages/deployment/workloads/memgraph-in-cybersecurity.mdx b/pages/deployment/workloads/memgraph-in-cybersecurity.mdx index 88a591148..779143bc7 100644 --- a/pages/deployment/workloads/memgraph-in-cybersecurity.mdx +++ b/pages/deployment/workloads/memgraph-in-cybersecurity.mdx @@ -166,6 +166,7 @@ For security use cases involving attack path analysis and threat propagation, Me - **Weighted shortest paths**: Calculate the most likely attack paths based on security metrics (e.g., vulnerability scores, access levels) - **All shortest paths**: Identify all possible attack vectors between critical assets +- **K shortest paths**: Find alternative attack routes and analyze path diversity in security networks - **Path filtering**: Focus analysis on specific types of security relationships or nodes These algorithms are crucial for: diff --git a/pages/getting-started.mdx b/pages/getting-started.mdx index 56ad5c9ed..c90f54cea 100644 --- a/pages/getting-started.mdx +++ b/pages/getting-started.mdx @@ -155,7 +155,7 @@ libraries](/client-libraries) and follow their getting started guide. Memgraph offers a range of procedures tailored to address specific graph problems. Built-in deep path traversal algorithms such as BFS, DFS, Weighted -shortest path, and All shortest paths can be executed using their specific +shortest path, All shortest paths, and K shortest paths can be executed using their specific clauses. Memgraph comes with expanded set of algorithms called [Memgraph Advanced Graph diff --git a/pages/querying/best-practices.mdx b/pages/querying/best-practices.mdx index 1ba7ce612..077732a95 100644 --- a/pages/querying/best-practices.mdx +++ b/pages/querying/best-practices.mdx @@ -598,9 +598,9 @@ utilized whenever possible to achieve the best performance**. In contrast to other graph databases, Memgraph deep path traversals efficiently handle complex graph queries, as these algorithms have been built into Memgraph's core. This eliminates the need for the overhead of business logic on -the application side. There are four built-in deep path traversal algorithms: -Depth-first search (DFS), Breadth-first search (BFS), Weighted Shortest Path and -All Shortest Paths. +the application side. There are five built-in deep path traversal algorithms: +Depth-first search (DFS), Breadth-first search (BFS), Weighted Shortest Path, +All Shortest Paths, and K Shortest Paths. (end:B) +RETURN p; +``` + +In Memgraph, you need to first match the source and target nodes, then use the `*KSHORTEST` syntax with a limit: + +```cypher +MATCH (start:A), (end:B) +WITH start, end +MATCH p=(start)-[:E *KSHORTEST | 3]->(end) +RETURN p; +``` + +Note that Memgraph requires both source and target nodes to be matched first using a `WITH` clause before applying the K-shortest paths algorithm. 
+ ### NOT label expression In Neo4j, you can use the `NOT` label expression (`!`): ```cypher