Added RWLock on committed state #599

Shubham8287 · 2023-11-27T20:36:42Z

Description of Changes

Removed the mutex locking on the wrapper state object (Inner) and added more granular locks on the individual states.
Introduced an RwLock on CommittedState.
Moved the methods implemented on Inner to the MutTxId struct.
Made bootstrap depend only on CommittedState.
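
As a rough illustration of the shape of this change, here is a minimal sketch using only `std::sync`. The struct layouts, field types, and the helper functions `row_count`, `insert_row`, and `next_sequence` are assumptions made for the example and are not the actual SpacetimeDB definitions.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex, RwLock};

// Simplified stand-ins; the real state types carry much more data.
#[derive(Default)]
struct CommittedState {
    tables: HashMap<u32, Vec<String>>,
}

#[derive(Default)]
struct SequenceState {
    next: u64,
}

// Before: a single mutex guards a wrapper around all state.
#[allow(dead_code)]
struct Inner {
    committed_state: CommittedState,
    sequence_state: SequenceState,
}

#[allow(dead_code)]
struct LockingBefore {
    inner: Arc<Mutex<Inner>>,
}

// After: one lock per state, with a read-write lock on the committed state.
#[derive(Default)]
struct LockingAfter {
    committed_state: Arc<RwLock<CommittedState>>,
    sequence_state: Arc<Mutex<SequenceState>>,
}

/// Counts rows in a table while holding only a shared read lock.
fn row_count(db: &LockingAfter, table_id: u32) -> usize {
    let committed = db.committed_state.read().expect("lock poisoned");
    committed.tables.get(&table_id).map_or(0, Vec::len)
}

/// Appends a row while holding the exclusive write lock.
fn insert_row(db: &LockingAfter, table_id: u32, row: &str) {
    let mut committed = db.committed_state.write().expect("lock poisoned");
    committed.tables.entry(table_id).or_default().push(row.to_owned());
}

/// Advances the sequence without touching the committed state lock.
fn next_sequence(db: &LockingAfter) -> u64 {
    let mut seq = db.sequence_state.lock().expect("lock poisoned");
    seq.next += 1;
    seq.next
}
```

With this layout, any number of `row_count` calls can run concurrently, while `insert_row` excludes both readers and other writers for the duration of its guard.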

API and ABI breaking changes

If this is an API or ABI breaking change, please apply the
corresponding GitHub label.

Expected complexity level and risk

How complicated do you think these changes are? Grade on a scale from 1 to 5,
where 1 is a trivial change, and 5 is a deep-reaching and complex change.

This complexity rating applies not only to the complexity apparent in the diff,
but also to its interactions with existing and future code.

If you answered more than a 2, explain what is complex about the PR,
and what other components it interacts with in potentially concerning ways.

joshua-spacetime · 2023-11-27T23:22:01Z

Are any updates to the threading model needed here?

gefjon

I gave up commenting on the unwraps, but I'd like you to search through this change for every unwrap call, and:

If you believe it never panics based only on local reasoning, add a comment describing why.
If the caller must uphold some invariant to avoid a panic, add them as a doc comment to the function, and add a comment to the unwrap describing why those invariants are sufficient to avoid a panic.
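
A small sketch of what the two kinds of justification could look like. The `Tx` type and the `largest` function are hypothetical and exist only to illustrate the commenting style; they are not taken from the diff.

```rust
/// Hypothetical transaction handle used only for illustration.
struct Tx {
    /// `Some` from `begin` until `finish` is called.
    state: Option<Vec<u64>>,
}

impl Tx {
    fn begin() -> Self {
        Tx { state: Some(Vec::new()) }
    }

    /// Records a row id in the pending transaction state.
    ///
    /// # Panics
    ///
    /// Panics if called after [`Tx::finish`]. Callers must not use the
    /// handle once the transaction is finished.
    fn insert(&mut self, row_id: u64) {
        // Caller invariant (see `# Panics`): `state` is set in `begin` and
        // cleared only in `finish`, so it is `Some` for every live transaction.
        let state = self.state.as_mut().unwrap();
        state.push(row_id);
    }

    /// Ends the transaction and returns the recorded row ids.
    fn finish(&mut self) -> Vec<u64> {
        // `take` leaves `None` behind; a second call yields an empty list
        // instead of panicking.
        self.state.take().unwrap_or_default()
    }
}

/// Returns the largest of a fixed set of values.
fn largest() -> u64 {
    let values = [3u64, 7, 5];
    // Local reasoning: `values` is a non-empty array literal, so `max`
    // always returns `Some`.
    values.iter().copied().max().unwrap()
}
```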


gefjon · 2023-11-28T16:22:29Z

crates/core/src/db/datastore/locking_tx_datastore/mod.rs

        Ok(())
    }
+    //TODO(shubham): Need to confirm, if indexes exist during bootstrap to be used here.
+    // This iter has only been implemented to use during bootstrap


It's not obvious to me why this is only used during bootstrap; could you add a comment describing who calls this, and what non-bootstrap paths do instead?


gefjon · 2023-11-28T16:38:13Z

crates/core/src/db/datastore/locking_tx_datastore/mod.rs

        row.encode(&mut bytes);
        let data_key = DataKey::from_data(&bytes);
        let row_id = RowId(data_key);
+        let tx_state = self.tx_state_lock.as_mut().unwrap();


Ditto, comment on enclosing function re: panic safety, or local justification why this will not panic.

gefjon · 2023-11-28T16:38:22Z

crates/core/src/db/datastore/locking_tx_datastore/mod.rs

            table
        } else {
-            let Some(committed_table) = self.committed_state.tables.get(&table_id) else {
+            let Some(committed_table) = self.committed_state_write_lock.as_ref().unwrap().tables.get(&table_id) else {


Panic safety.

gefjon · 2023-11-28T16:39:52Z

crates/core/src/db/datastore/locking_tx_datastore/mod.rs

+        drop(sequence_state);
+        drop(commit_state);


Don't these get dropped automatically? Why do we need to explicitly call drop?

Because we want to return the datastore, which requires all of its borrowers to be dropped first.

I wouldn't expect that to be the case - Rust is usually able to infer when borrows can/should/must end and insert drops automatically as appropriate.

You are right, but unexpectedly, for some reason, the compiler is not able to infer the last usage of sequence_state (type MutexGuard) and expects it to be dropped only when the scope ends.
Though I am able to get rid of the explicit drop by not binding sequence_state, it looks like something interesting is happening here.
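
The behavior described here can be reproduced in isolation. The sketch below assumes a stand-in `Datastore` with a single `Mutex` field; the names are illustrative. Because `MutexGuard` implements `Drop`, a bound guard keeps its borrow of the datastore alive until the guard is dropped, which by default is the end of the enclosing scope, after the return value has been moved out.

```rust
use std::sync::Mutex;

// Stand-in type for the example; not the real datastore.
struct Datastore {
    sequence_state: Mutex<u64>,
}

/// Variant with a named guard: the explicit `drop` is required.
fn with_explicit_drop() -> Datastore {
    let datastore = Datastore { sequence_state: Mutex::new(0) };
    let mut sequence_state = datastore.sequence_state.lock().unwrap();
    *sequence_state += 1;
    // Without this line the guard would be dropped at the end of the
    // function, after `datastore` is moved out, and the compiler rejects
    // the move (E0505: cannot move out of `datastore` because it is borrowed).
    drop(sequence_state);
    datastore
}

/// Variant without a binding: the guard is a temporary that is dropped at
/// the end of the statement, so no explicit `drop` is needed.
fn without_binding() -> Datastore {
    let datastore = Datastore { sequence_state: Mutex::new(0) };
    *datastore.sequence_state.lock().unwrap() += 1;
    datastore
}
```

A third option is to wrap the guard's use in an inner block so that its scope ends before the return expression.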

Shubham8287 · 2023-11-28T18:07:46Z

Replied to a few. I feel I need to add a few more comments in the code; will do that. Thanks for taking a first look, @gefjon.

gefjon · 2023-11-29T15:58:35Z

crates/core/src/db/datastore/locking_tx_datastore/mod.rs

+/// Struct contains various database states, each protected by
+/// their own lock. To avoid deadlocks, it is crucial to acquire these locks
+/// in a consistent order throughout the application.
+///
+/// Lock Acquisition Order:
+/// 1. `memory`
+/// 2. `committed_state`
+/// 3. `tx_state`
+/// 4. `sequence_state`
+///
+/// All locking mechanisms are encapsulated within the struct through local methods.
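
A minimal sketch of how a single method can encapsulate this acquisition order, assuming simplified payload types and a hypothetical `lock_all` helper (neither is taken from the diff):

```rust
use std::sync::{Mutex, MutexGuard, RwLock, RwLockWriteGuard};

// Stand-in for the struct documented above; payload types are placeholders.
#[derive(Default)]
struct Locking {
    memory: Mutex<Vec<u8>>,
    committed_state: RwLock<Vec<String>>,
    tx_state: Mutex<Vec<String>>,
    sequence_state: Mutex<u64>,
}

/// All guards needed by a mutable transaction, bundled together.
#[allow(dead_code)]
struct AllGuards<'a> {
    memory: MutexGuard<'a, Vec<u8>>,
    committed_state: RwLockWriteGuard<'a, Vec<String>>,
    tx_state: MutexGuard<'a, Vec<String>>,
    sequence_state: MutexGuard<'a, u64>,
}

impl Locking {
    /// Acquires every lock in the documented order. Because this is the only
    /// place where more than one lock is taken, two threads can never hold
    /// the locks in conflicting orders.
    fn lock_all(&self) -> AllGuards<'_> {
        let memory = self.memory.lock().expect("memory lock poisoned");
        let committed_state = self.committed_state.write().expect("committed_state lock poisoned");
        let tx_state = self.tx_state.lock().expect("tx_state lock poisoned");
        let sequence_state = self.sequence_state.lock().expect("sequence_state lock poisoned");
        AllGuards { memory, committed_state, tx_state, sequence_state }
    }
}
```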


Shubham8287 · 2023-11-29T16:23:49Z

I gave up commenting on the unwraps, but I'd like you to search through this change for every unwrap call, and:

If you believe it never panics based only on local reasoning, add a comment describing why.

If the caller must uphold some invariant to avoid a panic, add them as a doc comment to the function, and add a comment to the unwrap describing why those invariants are sufficient to avoid a panic.

@gefjon unwrap() has been used only for tx_state in MutTxId, and tx_state must not be None during the whole transaction journey. So, it is never going to panic.

I feel it would be redundant to add a comment for it on every method, but I also didn't find any good way to state this in comments, though I made an attempt here https://github.com/clockworklabs/SpacetimeDB/pull/599/files#diff-92219f25d92114287cc7ccb95ba7bae63f6172afd393de14df7a44968b7fa952R125 on MutTxId. Let me know if you have a better idea :)


Shubham8287 · 2023-11-29T19:07:50Z

Are any updates to the threading model needed here?

@joshua-spacetime, this PR doesn't make any functional changes, so no. Maybe my next part of the "read write lock" work will require them. Can you point me to code to understand a bit about the threading model?

gefjon

Great, thanks!

cloutiertyler

It generally looks great. More or less exactly what I was looking for. I have a few comments inline. The most important one is that I think the Option can be removed around TxState now that it's no longer stored in the Inner struct.

Please see this PR where I pulled it out. I'm 90% sure that this should work, but the 10% is because there's one place in iter_by_col_range that actually cared whether it was None or not. Could you confirm that my PR makes sense or that it doesn't?

cloutiertyler · 2023-11-30T03:08:43Z

crates/core/src/db/datastore/locking_tx_datastore/mod.rs

+///
+/// The initialization of this struct is sensitive because improper
+/// handling can lead to deadlocks. Therefore, it is strongly recommended to use
+///`Locking::begin_mut_tx()` for instantiation to ensure safe acquisition of locks.


Suggested change

///`Locking::begin_mut_tx()` for instantiation to ensure safe acquisition of locks.

/// `Locking::begin_mut_tx()` for instantiation to ensure safe acquisition of locks.

cloutiertyler · 2023-11-30T03:09:25Z

crates/core/src/db/datastore/locking_tx_datastore/mod.rs

+/// The initialization of this struct is sensitive because improper
+/// handling can lead to deadlocks. Therefore, it is strongly recommended to use
+///`Locking::begin_mut_tx()` for instantiation to ensure safe acquisition of locks.
+/// `tx_state_lock` should remain `Some` throughout the transaction's lifecycle, until `commit()` or `rollback()` is called.


This is great. Thanks for adding this comment.

cloutiertyler · 2023-11-30T03:16:18Z

crates/core/src/db/datastore/locking_tx_datastore/mod.rs


+    // TODO(shubham): Need to confirm, if indexes exist during bootstrap to be used here.
+    /// Iter for`CommittedState`, Only to be used during bootstrap.
+    /// For transcation, consider using MutTxId::Iters.


Suggested change

/// For transcation, consider using MutTxId::Iters.

/// For transaction, consider using MutTxId::Iters.

cloutiertyler · 2023-11-30T03:18:43Z

crates/core/src/db/datastore/locking_tx_datastore/mod.rs

        Ok(())
    }

+    // TODO(shubham): Need to confirm, if indexes exist during bootstrap to be used here.


I'm not sure what this comment means. I'm assuming that this will do the equality check by scanning rather than using an index, but we only use this during bootstrapping, so the question is: is it worth it to add indexes here?

My doubt is: if it's possible to use indexes during bootstrap, we should do that; if we haven't generated indexes by this time during bootstrap, we are fine with scanning. I just need to verify it, so I added a TODO.

I have another thought at the back of my mind: we may be using this iter for read-only transactions as well, in which case we would have to add the indexes. But that depends on the approach, and it's a little early to think about that here.

You might want to read this code again after #596 lands which refactors bootstrapping and hopefully makes it clearer.

As is now more clear after my PR, bootstrapping is a list of steps:

pub(crate) fn system_tables() -> [TableSchema; 6] {
    [
        st_table_schema(),
        st_columns_schema(),
        st_indexes_schema(),
        st_constraints_schema(),
        st_module_schema(),
        // Is important this is always last, so the starting sequence for each
        // system table is correct.
        st_sequences_schema(),
    ]
}

... where each part of the system is progressively added to the DB environment. This means it is partially "broken/unfinished" until the last step is executed.

Ideally, it should be PartialDB -> Db, but this would duplicate some code.

About indexes: they are incomplete, so it is a bad idea to rely on them. They need a "dehydrate" step:

// Create the system tables and insert information about themselves into
datastore.bootstrap_system_tables()?;
// The database tables are now initialized with the correct data.
// Now we have to build our in-memory structures.
datastore.build_sequence_state()?;
datastore.build_indexes()?;

(Now I think this should be moved inside datastore.bootstrap_system_tables)
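
To make the ordering concern concrete, here is a toy sketch. `ToyDatastore`, its fields, the table ids, and `table_id_by_name` are all assumptions for illustration; only the two-step shape (insert rows first, build in-memory indexes afterwards) comes from the discussion above. Lookups made before the index exists fall back to a scan.

```rust
use std::collections::HashMap;

/// Toy datastore: `rows` stands in for committed system-table rows,
/// `index_by_name` for an in-memory index that is built after bootstrap.
#[derive(Default)]
struct ToyDatastore {
    rows: Vec<(u32, String)>,
    index_by_name: Option<HashMap<String, u32>>,
}

impl ToyDatastore {
    /// Step 1: insert the rows describing the system tables.
    /// No index exists yet at this point.
    fn bootstrap_system_tables(&mut self) {
        // Illustrative names and ids only.
        let names = ["st_table", "st_columns", "st_indexes", "st_constraints", "st_module", "st_sequences"];
        for (id, name) in names.iter().enumerate() {
            self.rows.push((id as u32, name.to_string()));
        }
    }

    /// Step 2: build the in-memory index from the rows that are now present.
    fn build_indexes(&mut self) {
        let index: HashMap<String, u32> =
            self.rows.iter().map(|(id, name)| (name.clone(), *id)).collect();
        self.index_by_name = Some(index);
    }

    /// Uses the index when it has been built, otherwise scans the rows.
    fn table_id_by_name(&self, name: &str) -> Option<u32> {
        match &self.index_by_name {
            Some(index) => index.get(name).copied(),
            None => self
                .rows
                .iter()
                .find(|(_, n)| n.as_str() == name)
                .map(|(id, _)| *id),
        }
    }
}
```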


cloutiertyler · 2023-11-30T04:19:35Z

crates/core/src/db/datastore/locking_tx_datastore/mod.rs

 pub struct MutTxId {
-    lock: ArcMutexGuard<RawMutex, Inner>,
+    committed_state_write_lock: SharedWriteGuard<CommittedState>,
+    tx_state_lock: SharedMutexGuard<Option<TxState>>,


This no longer needs to be Option now that it lives inside the MutTxId itself. It's not possible for there to be a MutTxId unless there's a transaction in process.
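
A minimal sketch of the idea, assuming a hypothetical `MutTx` handle with simplified state types (not the real `MutTxId`): when the handle owns its transaction state directly and `commit`/`rollback` consume the handle, no `Option` and no `unwrap` are needed.

```rust
// Simplified stand-ins for illustration only.
#[derive(Default)]
struct TxState {
    inserted: Vec<u64>,
}

#[derive(Default)]
struct CommittedState {
    rows: Vec<u64>,
}

/// A handle that exists only while a transaction is in progress, so it can
/// own a `TxState` directly instead of an `Option<TxState>`.
struct MutTx<'a> {
    committed: &'a mut CommittedState,
    tx_state: TxState,
}

impl<'a> MutTx<'a> {
    fn begin(committed: &'a mut CommittedState) -> Self {
        MutTx { committed, tx_state: TxState::default() }
    }

    /// No `unwrap` needed: the state is always present.
    fn insert(&mut self, row: u64) {
        self.tx_state.inserted.push(row);
    }

    /// Consumes the handle, so it cannot be used after commit.
    fn commit(self) {
        let MutTx { committed, tx_state } = self;
        committed.rows.extend(tx_state.inserted);
    }

    /// Consumes the handle and discards the pending rows.
    fn rollback(self) {}
}
```

Taking `self` by value in `commit` and `rollback` lets the compiler enforce the lifecycle that the `Option` previously tracked at run time.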

Shubham8287 · 2023-11-30T06:16:58Z

I think the Option can be removed around TxState now that it's no longer stored in the Inner struct.

I had the same thought, which led me to start a discussion about dropping the transaction object after commit or rollback, to be sure about its lifecycle.

Please see this PR where I pulled it out. I'm 90% sure that this should work, but the 10% is because there's one place in iter_by_col_range that actually cared whether it was None or not. Could you confirm that my PR makes sense or that it doesn't?

None handling is only for bootstrapping; this branch implements CommittedState::schema_for_table to handle that, hence it should be fine.

Let's move this conversation to your PR :)


kulakowski · 2023-11-30T15:46:09Z

crates/core/src/db/datastore/locking_tx_datastore/mod.rs

+///
+/// The initialization of this struct is sensitive because improper
+/// handling can lead to deadlocks. Therefore, it is strongly recommended to use
+/// `Locking::begin_mut_tx()` for instantiation to ensure safe acquisition of locks.


I'm fine with this for the moment.

In the long run, presumably we have finer grained locks so as to not take them all, every time. I think it's worthwhile to write a proposal about what operations will operate on which bits of state, and what strategies we will use to ensure we don't take locks in the wrong order as things get more complicated, and what strategies we will use to detect when we mess it up.

I'm particularly interested in the controldb which is one plausible way to have "surprising" reentrancy.
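
One possible detection strategy, sketched here purely as an assumption and not as anything proposed in this PR, is to assign each lock a numeric rank and track the ranks held by the current thread. The names `HELD_RANKS`, `RankToken`, and `acquire_rank` are hypothetical.

```rust
use std::cell::RefCell;

thread_local! {
    /// Ranks of the locks currently held by this thread.
    static HELD_RANKS: RefCell<Vec<u8>> = RefCell::new(Vec::new());
}

/// RAII token recording that this thread holds a lock of the given rank.
struct RankToken(u8);

/// Registers the intent to take a lock of `rank`. Returns an error if the
/// thread already holds a lock of the same or a higher rank, which would
/// violate a strictly increasing acquisition order.
fn acquire_rank(rank: u8) -> Result<RankToken, String> {
    HELD_RANKS.with(|held| {
        let mut held = held.borrow_mut();
        if let Some(&highest) = held.iter().max() {
            if rank <= highest {
                return Err(format!(
                    "lock rank {} requested while holding rank {}",
                    rank, highest
                ));
            }
        }
        held.push(rank);
        Ok(RankToken(rank))
    })
}

impl Drop for RankToken {
    fn drop(&mut self) {
        HELD_RANKS.with(|held| {
            let mut held = held.borrow_mut();
            // Remove the most recent entry with this rank.
            if let Some(pos) = held.iter().rposition(|&r| r == self.0) {
                held.remove(pos);
            }
        });
    }
}
```

A real lock wrapper would call `acquire_rank` before locking and keep the token alongside the guard; in this sketch the check is per thread only and says nothing about reentrancy across processes.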

kulakowski · 2023-11-30T15:59:07Z

crates/core/src/db/datastore/locking_tx_datastore/mod.rs

        Ok(())
    }

+    // TODO(shubham): Need to confirm, if indexes exist during bootstrap to be used here.


You might want to read this code again after #596 lands which refactors bootstrapping and hopefully makes it clearer.

kulakowski · 2023-11-30T16:01:56Z

crates/core/src/db/datastore/locking_tx_datastore/mod.rs

    }

-    fn create_sequence(&mut self, seq: SequenceDef) -> super::Result<SequenceId> {
+    fn create_sequence(&mut self, seq: SequenceDef, database_address: Address) -> super::Result<SequenceId> {


For next time: I think we could have factored this PR in two. Some of these changes are very mechanical, and IMO could have landed separately first. I'm always going to ask for more and smaller PRs.

Shubham8287 · 2023-12-01T05:51:41Z

Blocked on https://github.com/clockworklabs/SpacetimeDB/pull/596/files getting merged, to avoid conflicts.

cloutiertyler

This now LGTM.

mamcx

LGTM

Shubham8287 added 7 commits November 22, 2023 11:45

added read write lock

3dd980e

disable read lock

fc9d8e2

set bootstrap out of txn

5db0982

removed read instance

8595f60

Merge branch 'master' into shub/rw-lock

a00fd1f

iter for commited state

bf02430

added comments

3b12fec

Shubham8287 force-pushed the shub/rw-lock branch from c79155d to 3b12fec Compare November 27, 2023 20:46

lint

9a2b6b7

gefjon requested changes Nov 28, 2023


removed inner Struct

34c33cf

Shubham8287 force-pushed the shub/rw-lock branch 2 times, most recently from 84cbe1a to 0cfb632 Compare November 29, 2023 12:17

Shubham8287 added 2 commits November 29, 2023 13:38

removed uneccesary option from commited_state_lock

794604d

added type alias and deadlock safety comment

d29cac2

gefjon reviewed Nov 29, 2023


kulakowski reviewed Nov 29, 2023


Shubham8287 added 2 commits November 29, 2023 17:39

added more comment

90ae921

Merge branch 'master' into shub/rw-lock

0cfb632

Shubham8287 requested a review from gefjon November 29, 2023 18:41

gefjon approved these changes Nov 29, 2023


Shubham8287 added 2 commits November 29, 2023 21:45

lint

72f0d04

removed unneccessary drop

715916c

cloutiertyler requested changes Nov 30, 2023


Shubham8287 requested a review from cloutiertyler November 30, 2023 09:10

Shubham8287 and others added 3 commits November 30, 2023 12:14

nit pick

93d1b7f

Removed the tx_state lock from the MutTxId, since it's no longer need…

4454ef5

…ed (#617) * Removed the tx_state lock from the MutTxId, since it's no longer needed. * cargo fmt

adjusted comment as we removed Option from tx_state

170b19d

kulakowski reviewed Nov 30, 2023


Shubham8287 added 2 commits December 5, 2023 03:21

Merge branch 'master' into shub/rw-lock

585dbc3

fix nit

5499d09

cloutiertyler approved these changes Dec 5, 2023


kulakowski added the release-0.8 label Dec 5, 2023

mamcx approved these changes Dec 6, 2023


Shubham8287 merged commit ccf291b into master Dec 6, 2023

joshua-spacetime mentioned this pull request Apr 9, 2024

Implement the concept of a read-only transaction for RelationalDB so that we don’t need to track write skew #1066

Closed

Shubham8287 deleted the shub/rw-lock branch July 29, 2025 05:29
