-
Notifications
You must be signed in to change notification settings - Fork 797
runtime: Collect stake delegations only once during epoch activation #8065
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
2b1439a to
5525c4f
Compare
5525c4f to
5917723
Compare
b52cfbf to
3b50554
Compare
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #8065 +/- ##
========================================
Coverage 83.2% 83.2%
========================================
Files 863 863
Lines 373941 374132 +191
========================================
+ Hits 311207 311404 +197
+ Misses 62734 62728 -6 🚀 New features to boost your workflow:
|
3b50554 to
d5159c1
Compare
|
There is an issue with this PR for epoch_reward_cache. The PR moved the cache check after the computation. Before the PR, the cache was checked before computing rewards in calculate_rewards_and_distribute_vote_rewards. After the PR, the cache is only populated in save_rewards, which happens after the expensive computation. |
And your worry is that it will take more than one slot? Or is there something else you have in mind? To be precise - the computation you're talking about, currently takes around 50ms. And the entire epoch boundary after this change - 330ms. So I think we are fine. The overall goal of my optimizations here is to keep epoch boundary below one slot. |
Yes. we used to have many forks at epoch boundary. And the cache is introduced to avoid computing the rewards again at forks. If we are certain that there is going to be no forks, we can remove the cache. In this Pr, we store to the cache but never read from it. seems a waste. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent work!
LGTM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found this pretty difficult to review due to how large the diff is (~850 lines). Ideally you don't move functions around and refactor them in the same PR. And I think some of the changes aren't necessary, left comments for those.
| } | ||
|
|
||
| // Calculate rewards from previous epoch and distribute vote rewards | ||
| pub(in crate::bank) fn calculate_rewards_and_distribute_vote_rewards( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't actually distributing the vote rewards anymore, so this function should be renamed to something like calculate_and_cache_epoch_rewards. The CalculateRewardsAndDistributeVoteRewardsResult struct should be renamed as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function didn't get renamed yet
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's because I didn't end up splitting calculation and distribution in the current version of this code, I've realized that refactor was unnecessary. The distribution part is still there in the same method:
agave/runtime/src/bank/partitioned_epoch_rewards/calculation.rs
Lines 211 to 233 in 14c000f
| // verify that we didn't pay any more than we expected to | |
| assert!(point_value.rewards >= total_vote_rewards + total_stake_rewards_lamports); | |
| info!( | |
| "distributed vote rewards: {} out of {}, remaining {}", | |
| total_vote_rewards, point_value.rewards, total_stake_rewards_lamports | |
| ); | |
| let (num_stake_accounts, num_vote_accounts) = { | |
| let stakes = self.stakes_cache.stakes(); | |
| ( | |
| stakes.stake_delegations().len(), | |
| stakes.vote_accounts().len(), | |
| ) | |
| }; | |
| self.capitalization.fetch_add(total_vote_rewards, Relaxed); | |
| let active_stake = if let Some(stake_history_entry) = | |
| self.stakes_cache.stakes().history().get(prev_epoch) | |
| { | |
| stake_history_entry.effective | |
| } else { | |
| 0 | |
| }; |
And the only thing I had to move away in order to change &mut self to &self is the self.set_epoch_reward_status_calculation(distribution_starting_block_height, stake_rewards); call.
14c000f#diff-802ffb9b4536fc89679c0834342ccaa0d32bb482a85e846a444e491a93e684e5
(this commit is a self-contained change of mutability there)
Calling self.capitalization.fetch_add and logging active stake is still fine even after changing &mut self to &self.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm well you also split out Bank::store_vote_accounts_partitioned (also inside Bank::save_rewards) from Bank::calculate_rewards_and_distribute_vote_rewards so the core part of vote reward distribution is actually not in there. But I see your point about the other distribution code being in there still. Do you think we could move all of that code into Bank::save_rewards (maybe rename this to distribute_vote_rewards) so that all the code for vote reward distribution is in the same place?
Specifically:
Bank::update_vote_rewards- Capitalization update
And then Bank::create_epoch_rewards_sysvar can be called after Bank::save_rewards.
I don't care a lot about keeping the datapoints ("epoch_rewards" and "epoch-rewards-status-update") consistent but others may disagree.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I 'm done with the split to Bank::calculate_rewards and Bank::distribute_vote_rewards: 9811963
And then
Bank::create_epoch_rewards_sysvarcan be called afterBank::save_rewards.
I tried to do that, but I didn't end up pushing that, because:
- Calling
Bank::create_epoch_rewards_sysvarinbegin_partitioned_rewardsworks (at that point, calculations are already done). Do you really feel like having it there is incorrect? - It needs two other variables that live inside
Bank::begin_partitioned_rewards-distribution_starting_block_heightandnum_partitions. - The diff is pretty noisy (I even had to change the publicity of
StakeRewardCalculationand import additional stuff) and doesn't seem to fit your "don't do unnecessary refactors" narrative. - After such extraction,
begin_partitioned_rewardsonly callscalculate_rewardsand logs datapoints. - It adds code to already big bank.rs.
See the diff: https://gist.github.com/vadorovsky/58917c7934b9636d3edce5d5cc38871e
If you really want me to extract that code, perhaps creating a yet another method inside calculation.rs would be the way. that is going to be less noisy, but also, in such case, we might think about removing Bank::begin_partitioned_rewards and doing everything in Bank::calculate_rewards.
I don't care a lot about keeping the datapoints (
"epoch_rewards"and"epoch-rewards-status-update") consistent but others may disagree.
I kept all datapoints from the former calculate_rewards_and_distribute_vote_rewards, but split across the two methods.
In general, in this PR, I removed one datapoint - about filtering delegations - because now we just return a lazy iterator and we filter on each iteration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Calling Bank::create_epoch_rewards_sysvar in begin_partitioned_rewards works (at that point, calculations are already done). Do you really feel like having it there is incorrect?
My suggestion wasn't about correctness, just about making the code easier to understand. I think it's better to do read-only calculation changes separately from the changes that modify state. Methods named with "compute" and "calculate" don't seem like they would have side effects like creating a sysvar for example.
I put together a commit with some suggested refactorings here: jstarry@ed944e0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Capitalization update needs to be fixed and should have a test
8b44339 to
01c5aac
Compare
01b54bd to
d893331
Compare
0723476 to
ef1f93b
Compare
That allows calling the method without ownership over `PointValue`. Copies of the integers from the structure are cheap. Currently, the validator code calling the method has an owned `PointerValue` instance; but that's going to change in the following changes.
Before this change, `Stakes::activate_epoch` was performing calculations and mutating the cache at the same time. The latter was the reason why it was taking `&mut self` and acquiring a write lock on the cache. But the calculations themselves don't require mutability and a write lock. Split out the new `Stakes::calculate_activated_stake` method that performs only calculations, needs only an immutable `&self` and a read lock. Then reduce the scope of `Stake::activate_epoch` just to assigning the values computed by `calculate_activated_stake`. Also, add the new `Bank::compute_new_epoch_caches_and_rewards` method, called in `Bank::process_new_epoch`, that holds a read lock on the stakes cache.
Stake delegations are stored as hash array mapped trie (HAMT)[0], which means that inserts, deletions and lookups are average-case O(1) and worst-case O(log n). However, the performance of iterations is poor due to depth-first traversal and jumps. Currently it's also impossible to iterate over it with rayon. That issue is known and handled by converting the HAMT to a vector with `stakes.stake_delegations.iter().collect()`. Move that trick to a dedicated method that describes the performance consequences. [0] https://en.wikipedia.org/wiki/Hash_array_mapped_trie
`filter_stake_delegations` was always collecting the stake delegations. To allow re-using already existing vectors or slices of stake delegations, add `FilteredStakeDelegation` type that wraps a `Cow` of stake delegations and provides a parallel iterator that yields elements, filtering them by the minimum stake. Note that the wrapper yields an `Option`. That makes it possible to know the size of the iterator without collecting elements. That property is critical for the ability to pre-allocate data structures that contain processed stake delegations. However, it adds a necessity of handling `None` elements while iterating.
ef1f93b to
8ec02be
Compare
| } | ||
|
|
||
| // Calculate rewards from previous epoch and distribute vote rewards | ||
| pub(in crate::bank) fn calculate_rewards_and_distribute_vote_rewards( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm well you also split out Bank::store_vote_accounts_partitioned (also inside Bank::save_rewards) from Bank::calculate_rewards_and_distribute_vote_rewards so the core part of vote reward distribution is actually not in there. But I see your point about the other distribution code being in there still. Do you think we could move all of that code into Bank::save_rewards (maybe rename this to distribute_vote_rewards) so that all the code for vote reward distribution is in the same place?
Specifically:
Bank::update_vote_rewards- Capitalization update
And then Bank::create_epoch_rewards_sysvar can be called after Bank::save_rewards.
I don't care a lot about keeping the datapoints ("epoch_rewards" and "epoch-rewards-status-update") consistent but others may disagree.
ee4584c to
1d141dd
Compare
`Bank::begin_partitioned_rewards` was taking `&mut self`, even though all the calculations done by it do not require mutability and the only LOC requiring it was a call to `set_epoch_reward_status_calculation`. `Bank::calculate_rewards_and_distribute_vote_rewards` was calling `Bank::store_vote_accounts_partitioned` that acquires a write lock on stakes cache. To make a clear boundary between the code that: * Does not mutate and acquires only read locks. * Mutates and acquires write locks. Split the mentioned code into two methods: * `Bank::calculate_rewards` - takes `&self`, does not acquire any locks. However, we are going to pass a slice of delegations, that acquires a read lock on status cache, in the follow up changes there. * `Bank::distribute_vote_rewards` - takes `&mut self` and acquires a write lock on status cache.
…ake` `Bank::calculate_activated_stake` was collecting stake delegations to a vector internally. That prevents us from re-using it in the latter parts of epoch boundary. Fix that by taking a slice of stake delegations.
Processing new epoch (`Bank::process_new_epoch`) involves collecting stake delegations twice: 1) In `Bank::compute_new_epoch_caches_and_rewards`, to create a stake history entry and refresh vote accounts. 2) In `Bank::get_epoch_reward_calculate_param_info`, which is then used in `Bank::calculate_stake_vote_rewards` to calculate rewards for stakers and voters. Reduce that to just one collect by passing the vector 1) with freshly computed stake history and vote accounts to `Bank::begin_partitioned_rewards`. This way, we can avoid calling `Bank::get_epoch_reward_calculate_param_info`.
This method is now used only for recalculations after snapshot restore. Change the name to `get_epoch_params_for_recalculation` and adjust its documentation.
1d141dd to
4f7bb25
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks correct to me. I added a comment with some more suggested refactorings but this is fine as is already. Nice work!
Problem
Processing new epoch (
Bank::process_new_epoch) involves collecting stake delegations twice:Stakes::activate_epoch, to create a stake history entry and refresh vote accounts.Bank::filter_stake_delegations, which is then used inBank::calculate_stake_vote_rewardsto calculate rewards for stakers and voters.The overall time of crossing the epoch boundary is ~519ms:
Where the two heaviest operations are
collect()calls on stake delegations, each of them taking ~200-220ms:Summary of Changes
Reduce that to just one collect to a
Vec<(&Pubkey, &StakeAccount)>done on the beginning ofBank::process_new_epochand passing the stake delegations to the other methods.That vector holds references to the stake cache information, which stays behind read-write lock. To make sure that we hold only read lock for the lifetime of the vector, split all operations requiring mutation of
Bankand acquiring a write lock out and perform them after dropping the read lock.In summary, the new order of operations done in
Bank::process_new_epochis:stakes_cache.a)
Bank::begin_epoch_activation(a new method) that returns updatedstake_historyandvote_accounts(without modyfing theBankyet).b)
Bank::calculate_rewards_and_distribute_vote_rewards, which computes the stake rewards. Passstake_history,stake_delegationsandvote_accounts. Keep the result asrewards_result.stakes_cache.a)
StakesCache::activate_epoch, which now instead of performing any calculations, simply takes the previously computedstake_historyandvote_accountsand assigns them.Bank::save_rewards(a new method) that takesrewards_resultand uses it to updateepoch_reward_statusand theEpochRewardssysvar.The new time of crossing the epoch boundary is ~337ms:
There is only one heavy
collect()done on stake delegations, which still takes the most of main thread's time. But that's the best we can do while still usingim::HashMap.Fixes: #8282