-
Notifications
You must be signed in to change notification settings - Fork 645
Added RWLock on committed state #599
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
c79155d to
3b12fec
Compare
|
Are any updates to the threading model needed here? |
gefjon
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I gave up commenting on the unwraps, but I'd like you to search through this change for every unwrap call, and:
- If you believe it never panics based only on local reasoning, add a comment describing why.
- If the caller must uphold some invariant to avoid a panic, add them as a doc comment to the function, and add a comment to the unwrap describing why those invariants are sufficient to avoid a panic.
| Ok(()) | ||
| } | ||
| //TODO(shubham): Need to confirm, if indexes exist during bootstrap to be used here. | ||
| // This iter has only been implemented to use during bootstrap |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not obvious to me why this is only used during bootstrap; could you add a comment describing who calls this, and what non-bootstrap paths do instead?
| row.encode(&mut bytes); | ||
| let data_key = DataKey::from_data(&bytes); | ||
| let row_id = RowId(data_key); | ||
| let tx_state = self.tx_state_lock.as_mut().unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto, comment on enclosing function re: panic safety, or local justification why this will not panic.
| table | ||
| } else { | ||
| let Some(committed_table) = self.committed_state.tables.get(&table_id) else { | ||
| let Some(committed_table) = self.committed_state_write_lock.as_ref().unwrap().tables.get(&table_id) else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Panic safety.
| drop(sequence_state); | ||
| drop(commit_state); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't these get dropped automatically? Why do we need to explicitly call drop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As we are looking to return datastore, which requires to drop all the borrowers first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't expect that to be the case - Rust is usually able to infer when borrows can/should/must end and insert drops automatically as appropriate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right but unexpectly for some reason, Compiler is not able to infer sequence_state (type MutexGaurd) last usage and expecting it to drop only when scope ends.
Though, I am able to get rid of explicit drop by not binding sequence_state, but looks something interesting happening here.
|
Replied to few, I feel I need to add few more comments in code, will do that. Thanks to give a first look @gefjon. |
84cbe1a to
0cfb632
Compare
| /// Struct contains various database states, each protected by | ||
| /// their own lock. To avoid deadlocks, it is crucial to acquire these locks | ||
| /// in a consistent order throughout the application. | ||
| /// | ||
| /// Lock Acquisition Order: | ||
| /// 1. `memory` | ||
| /// 2. `committed_state` | ||
| /// 3. `tx_state` | ||
| /// 4. `sequence_state` | ||
| /// | ||
| /// All locking mechanisms are encapsulated within the struct through local methods. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Love this!
@gefjon I feel it will be redundant to add comment for it on every method but also didn't find any good way to state this in comments, though I made an attempt here https://github.com/clockworklabs/SpacetimeDB/pull/599/files#diff-92219f25d92114287cc7ccb95ba7bae63f6172afd393de14df7a44968b7fa952R125 on |
@joshua-spacetime, this PR doesn't make any funcational changes, so No. May be my next part of "read write lock" would require them, Can you point me to code to understand bit about threading model. |
gefjon
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, thanks!
cloutiertyler
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It generally looks great. More or less exactly what I was looking for. I have a few comments inline. The most important one is that I think the Option can be removed around TxState now that it's no longer stored in the Inner struct.
Please see this PR where I pulled it out. I'm 90% sure that this should work, but the 10% is because there's one place in iter_by_col_range that actually cared whether it was None or not. Could you confirm that my PR makes sense or that it doesn't?
| /// | ||
| /// The initialization of this struct is sensitive because improper | ||
| /// handling can lead to deadlocks. Therefore, it is strongly recommended to use | ||
| ///`Locking::begin_mut_tx()` for instantiation to ensure safe acquisition of locks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| ///`Locking::begin_mut_tx()` for instantiation to ensure safe acquisition of locks. | |
| /// `Locking::begin_mut_tx()` for instantiation to ensure safe acquisition of locks. |
| /// The initialization of this struct is sensitive because improper | ||
| /// handling can lead to deadlocks. Therefore, it is strongly recommended to use | ||
| ///`Locking::begin_mut_tx()` for instantiation to ensure safe acquisition of locks. | ||
| /// `tx_state_lock` should remain `Some` throughout the transaction's lifecycle, until `commit()` or `rollback()` is called. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great. Thanks for adding this comment.
|
|
||
| // TODO(shubham): Need to confirm, if indexes exist during bootstrap to be used here. | ||
| /// Iter for`CommittedState`, Only to be used during bootstrap. | ||
| /// For transcation, consider using MutTxId::Iters. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| /// For transcation, consider using MutTxId::Iters. | |
| /// For transaction, consider using MutTxId::Iters. |
| Ok(()) | ||
| } | ||
|
|
||
| // TODO(shubham): Need to confirm, if indexes exist during bootstrap to be used here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what this comment means. I'm assuming that this will do the equality check by scanning rather than using an index, but we only use this during bootstrapping, so the question is is it worth it to add indexes here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My doubt is, If its possible to use Indexes during bootstrap we should do that, If we don't generate indexes by this time during bootstrap, we are fine with scanning. I just need to verify it, so added a TODO.
I have another thought back of mind that we may be using this iter for read only transactions as well, then we have to add the indexes but that depends on the approach, little early to think about that here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You might want to read this code again after #596 lands which refactors bootstrapping and hopefully makes it clearer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As is now more clear after my PR, bootstrapping is a list of steps:
pub(crate) fn system_tables() -> [TableSchema; 6] {
[
st_table_schema(),
st_columns_schema(),
st_indexes_schema(),
st_constraints_schema(),
st_module_schema(),
// Is important this is always last, so the starting sequence for each
// system table is correct.
st_sequences_schema(),
]
}... where each part of the system is progressively added to the DB environment. It means is partially "broken/unfinished" until the last step is executed.
Ideally, It should be PartialDB -> Db but this will duplicate some code.
About indexes, they are incomplete, so is bad idea to rely on them. They need a "dehydrate" step:
// Create the system tables and insert information about themselves into
datastore.bootstrap_system_tables()?;
// The database tables are now initialized with the correct data.
// Now we have to build our in-memory structures.
datastore.build_sequence_state()?;
datastore.build_indexes()?;(Now I think this should be moved inside datastore.bootstrap_system_tables)
| pub struct MutTxId { | ||
| lock: ArcMutexGuard<RawMutex, Inner>, | ||
| committed_state_write_lock: SharedWriteGuard<CommittedState>, | ||
| tx_state_lock: SharedMutexGuard<Option<TxState>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This no longer needs to be Option now that it lives inside the MutTxId itself. It's not possible for there to be a MutTxId unless there's a transaction in process.
I had the same thoughts, which lead me to start discussion about dropping
let's move this conversaiton to your PR :) |
…ed (#617) * Removed the tx_state lock from the MutTxId, since it's no longer needed. * cargo fmt
| /// | ||
| /// The initialization of this struct is sensitive because improper | ||
| /// handling can lead to deadlocks. Therefore, it is strongly recommended to use | ||
| /// `Locking::begin_mut_tx()` for instantiation to ensure safe acquisition of locks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with this for the moment.
In the long run, presumably we have finer grained locks so as to not take them all, every time. I think it's worthwhile to write a proposal about what operations will operate on which bits of state, and what strategies we will use to ensure we don't take locks in the wrong order as things get more complicated, and what strategies we will use to detect when we mess it up.
I'm particularly interested in the controldb which is one plausible way to have "surprising" reentrancy.
| Ok(()) | ||
| } | ||
|
|
||
| // TODO(shubham): Need to confirm, if indexes exist during bootstrap to be used here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You might want to read this code again after #596 lands which refactors bootstrapping and hopefully makes it clearer.
| } | ||
|
|
||
| fn create_sequence(&mut self, seq: SequenceDef) -> super::Result<SequenceId> { | ||
| fn create_sequence(&mut self, seq: SequenceDef, database_address: Address) -> super::Result<SequenceId> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For next time: I think we could have factored this PR in two. Some of these changes are very mechanical, and IMO could have landed separately first. I'm always going to ask for more and smaller PRs.
|
Blocked on https://github.com/clockworklabs/SpacetimeDB/pull/596/files to get merge, to avoid conflicts. |
cloutiertyler
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This now LGTM.
mamcx
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Description of Changes
Inner) and added more granual locks on indivisual states.CommittedState.InnertoMutTxIdstruct.API and ABI breaking changes
If this is an API or ABI breaking change, please apply the
corresponding GitHub label.
Expected complexity level and risk
How complicated do you think these changes are? Grade on a scale from 1 to 5,
where 1 is a trivial change, and 5 is a deep-reaching and complex change.
This complexity rating applies not only to the complexity apparent in the diff,
but also to its interactions with existing and future code.
If you answered more than a 2, explain what is complex about the PR,
and what other components it interacts with in potentially concerning ways.