Use events API to eager send attestations #7892
base: unstable
    beacon_nodes: &Arc<BeaconNodeFallback<T>>,
) -> Result<(), String> {
    beacon_nodes
        .first_success(|beacon_node| async move {
Related: this function is not currently parallel, so it likely won't work very well if the first BN takes too long.
An alternative would be to maintain event streams from all BNs, and trigger the attestation process as soon as we receive a head event from one of them. However this also has an issue in that we might use a BN to create the attestation that hasn't yet imported the block. E.g. if we have BNs bn1,bn2,bn3 and we get a head event from bn2, then we'll try to attest using bn1, and build an attestation to the previous block because bn1 hasn't imported the new head yet.
This also raises a related issue, which is that we need to keep track of the slot of the head event, because the BNs could be lagging and sending us head events for old slots.
A new design that handles both of these problems might be:
- Maintain `head` event streams from all BNs.
- Begin the attestation process once a `head` event arrives that is superior to our current best known head (need to keep track of this).
- Option 1: Just try to attest using the BN(s) that sent you the head event (probably just 1 BN). Or, if we track the latest head on each BN in the background, some more BNs might be in this category.
- Option 2: Request attestations from all BNs, and only accept ones that match the latest known head.
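The bookkeeping in this design could be sketched roughly as follows. This is a minimal, synchronous illustration and not the PR's code: `HeadEvent`, `HeadTracker`, and their methods are hypothetical names, and "superior" is assumed here to mean a strictly higher slot (same-slot re-orgs are ignored for simplicity).

```rust
use std::collections::HashMap;

/// Hypothetical summary of a `head` event; field names are illustrative.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct HeadEvent {
    pub slot: u64,
    pub block_root: [u8; 32],
}

/// Tracks the latest head reported by each BN and the best head seen overall.
#[derive(Debug, Default)]
pub struct HeadTracker {
    best: Option<HeadEvent>,
    per_bn: HashMap<usize, HeadEvent>,
}

impl HeadTracker {
    /// Records a head event from the BN at `bn_index`.
    /// Returns `true` if the event beats the best known head, meaning the
    /// attestation process should begin. Assumption: "superior" means a
    /// strictly higher slot, so lagging BNs reporting old slots are ignored.
    pub fn on_head_event(&mut self, bn_index: usize, event: HeadEvent) -> bool {
        self.per_bn.insert(bn_index, event);
        let is_better = match self.best {
            None => true,
            Some(best) => event.slot > best.slot,
        };
        if is_better {
            self.best = Some(event);
        }
        is_better
    }

    /// Option 1: indices of BNs whose latest reported head is the best known head.
    pub fn candidates(&self) -> Vec<usize> {
        let Some(best) = self.best else {
            return Vec::new();
        };
        let mut indices: Vec<usize> = self
            .per_bn
            .iter()
            .filter(|(_, head)| **head == best)
            .map(|(index, _)| *index)
            .collect();
        indices.sort_unstable();
        indices
    }

    /// Option 2: accept an attestation only if it votes for the best known head.
    pub fn accepts(&self, attested_root: [u8; 32]) -> bool {
        self.best.map_or(false, |best| best.block_root == attested_root)
    }
}
```

With three BNs, a head event for slot 10 from bn2 makes bn2 the only Option 1 candidate until bn1 or bn3 report the same head; a late event for slot 9 from bn1 does not trigger attesting.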
I went with Option 1, but with an extra fail-safe branch that falls back to the original `first_success` if it encounters a race condition with the index of the BN that sent the latest head.
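As a rough illustration of that shape (not the code in this PR), the sketch below tries the BN that reported the latest head first and otherwise walks the remaining BNs in order until one succeeds. It is synchronous for brevity, whereas the real path is async; `attest_with_fallback` and its parameters are hypothetical names.

```rust
/// Tries `attest` on the preferred BN first (the one that sent the latest
/// head). If that index is missing, out of range, or the call fails, falls
/// back to trying the other BNs in order, returning the first success.
/// Returns the index of the BN that succeeded, or the last error seen.
pub fn attest_with_fallback<F>(
    preferred: Option<usize>,
    bn_count: usize,
    mut attest: F,
) -> Result<usize, String>
where
    F: FnMut(usize) -> Result<(), String>,
{
    let mut last_err = String::from("no beacon nodes available");

    // Fast path: only use the preferred index if it is still valid.
    if let Some(index) = preferred.filter(|&i| i < bn_count) {
        match attest(index) {
            Ok(()) => return Ok(index),
            Err(e) => last_err = e,
        }
    }

    // Fail-safe: sequential first-success over the remaining BNs.
    for index in (0..bn_count).filter(|&i| Some(i) != preferred) {
        match attest(index) {
            Ok(()) => return Ok(index),
            Err(e) => last_err = e,
        }
    }

    Err(last_err)
}
```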
Some required checks have failed. Could you please take a look @hopinheimer? 🙏
While working on #7892, @michaelsproul pointed out that it might be better for this metric to measure the delay from the start of the slot instead of from the current `slot_duration / 3`, since attestation duties now start before the 1/3 mark with the change in the linked PR.

Co-Authored-By: hopinheimer <[email protected]>
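To make the difference concrete, here is a small sketch of the two reference points. The function names are hypothetical, and the assumption that the old metric saturates at zero is mine, not something stated in the commit.

```rust
use std::time::Duration;

/// Delay measured from the 1/3 mark of the slot (the older reference point).
/// Assumption: times are durations since some common epoch, and anything sent
/// before the 1/3 mark saturates to zero.
pub fn delay_from_third(slot_start: Duration, now: Duration, slot_duration: Duration) -> Duration {
    now.saturating_sub(slot_start + slot_duration / 3)
}

/// Delay measured from the start of the slot (the newer reference point).
pub fn delay_from_slot_start(slot_start: Duration, now: Duration) -> Duration {
    now.saturating_sub(slot_start)
}
```

With a 12 s slot, an attestation sent 2.5 s into the slot is before the 4 s mark, so the first function cannot distinguish it from one sent at 1 s, while the second reports 2.5 s.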
* Optimise `state_root_at_slot` for finalized slot (sigp#8353)

  This is an optimisation targeted at Fulu networks in non-finality.

  While debugging on Holesky, we found that `state_root_at_slot` was being called from `prepare_beacon_proposer` a lot, for the finalized state:

  https://github.com/sigp/lighthouse/blob/2c9b670f5d313450252c6cb40a5ee34802d54fef/beacon_node/http_api/src/lib.rs#L3860-L3861

  This was causing `prepare_beacon_proposer` calls to take upwards of 5 seconds, sometimes 10 seconds, because it would trigger _multiple_ beacon state loads in order to iterate back to the finalized slot. Ideally, loading the finalized state should be quick because we keep it cached in the state cache (technically we keep the split state, but they usually coincide). Instead we are computing the finalized state root separately (slow), and then loading the state from the cache (fast).

  Although it would be possible to make the API faster by removing the `state_root_at_slot` call, I believe it's simpler to change `state_root_at_slot` itself and remove the footgun. Devs rightly expect operations involving the finalized state to be fast.

  Co-Authored-By: Michael Sproul <[email protected]>

* Remove Windows CI jobs (sigp#8362)

  Remove all Windows-related CI jobs.

  Co-Authored-By: antondlr <[email protected]>

* Update proposer-only section in the documentation (sigp#8358)

  Co-Authored-By: Tan Chee Keong <[email protected]>
  Co-Authored-By: Michael Sproul <[email protected]>

* Fix unaggregated delay metric (sigp#8366)

  While working on sigp#7892, @michaelsproul pointed out that it might be better for this metric to measure the delay from the start of the slot instead of from the current `slot_duration / 3`, since attestation duties now start before the 1/3 mark with the change in the linked PR.

  Co-Authored-By: hopinheimer <[email protected]>

* Downgrade and remove unnecessary logs (sigp#8367)

  ### Downgrade a non error to `Debug`

  I noticed this error on one of our hoodi nodes:

  ```
  Nov 04 05:13:38.892 ERROR Error during data column reconstruction block_root: 0x4271b9efae7deccec3989bd2418e998b83ce8144210c2b17200abb62b7951190, error: DuplicateFullyImported(0x4271b9efae7deccec3989bd2418e998b83ce8144210c2b17200abb62b7951190)
  ```

  This shouldn't be logged as an error: it's due to a normal race condition, and it doesn't impact the node negatively.

  ### Remove spammy logs

  This log is filling up the log files quite quickly, and it is also something we'd expect during normal operation (getting columns via EL before gossip). We haven't found this debug log to be useful, so I propose we remove it to avoid spamming debug logs.

  ```
  Received already available column sidecar. Ignoring the column sidecar
  ```

  In the process of removing this, I noticed we aren't propagating the validation result, which I think we should, so I've added this. The impact should be quite minimal: the message will stay in the gossip memcache for a bit longer but should be evicted in the next heartbeat.

  Co-Authored-By: Jimmy Chen <[email protected]>

* Prepare `sensitive_url` for `crates.io` (sigp#8223)

  Another good candidate for publishing separately from Lighthouse is `sensitive_url`, as it's a general utility crate and not related to Ethereum. This PR prepares it to be spun out into its own crate.

  I've made the `full` field on `SensitiveUrl` private and instead provided an explicit getter called `.expose_full()`. It's a bit ugly for the diff but I prefer the explicit nature of the getter. I've also added some extra tests and doc strings, along with feature gating the `Serialize` and `Deserialize` implementations behind the `serde` feature.

  Co-Authored-By: Mac L <[email protected]>

* Remove ecdsa feature of libp2p (sigp#8374)

  This compiles, is there any reason to keep `ecdsa`? CC @jxs

  Co-Authored-By: Michael Sproul <[email protected]>

* CI workflows to use warpbuild ci runner (sigp#8343)

  Self-hosted GitHub Runners review and improvements. The local testnet workflow now uses the warpbuild ci runner.

  Co-Authored-By: lemon <[email protected]>
  Co-Authored-By: antondlr <[email protected]>

* Remove `sensitive_url` and import from `crates.io` (sigp#8377)

  Use the recently published `sensitive_url` and remove it from Lighthouse.

  Co-Authored-By: Mac L <[email protected]>

* Migrate derivative to educe (sigp#8125)

  Fixes sigp#7001. Mostly mechanical replacement of `derivative` attributes with `educe` ones.

  ### **Attribute Syntax Changes**

  ```rust
  // Bounds: = "..." → (...)
  #[derivative(Hash(bound = "E: EthSpec"))]
  #[educe(Hash(bound(E: EthSpec)))]

  // Ignore: = "ignore" → (ignore)
  #[derivative(PartialEq = "ignore")]
  #[educe(PartialEq(ignore))]

  // Default values: value = "..." → expression = ...
  #[derivative(Default(value = "ForkName::Base"))]
  #[educe(Default(expression = ForkName::Base))]

  // Methods: format_with/compare_with = "..." → method(...)
  #[derivative(Debug(format_with = "fmt_peer_set_as_len"))]
  #[educe(Debug(method(fmt_peer_set_as_len)))]

  // Empty bounds: removed entirely, educe can infer appropriate bounds
  #[derivative(Default(bound = ""))]
  #[educe(Default)]

  // Transparent debug: manual implementation (educe doesn't support it)
  #[derivative(Debug = "transparent")]
  // Replaced with manual Debug impl that delegates to inner field
  ```

  **Note**: Some bounds use strings (`bound("E: EthSpec")`) for superstruct compatibility (`expected ','` errors).

  Co-Authored-By: Javier Chávarri <[email protected]>
  Co-Authored-By: Mac L <[email protected]>

* Fix flaky reconstruction test (sigp#8321)

  Fix flaky tests that depend on timing. Previously the test processed all 128 columns and expected reconstruction to happen after all columns were processed. There is a race here, and reconstruction could be triggered before all columns are processed. I've updated the tests to process 64 columns, just enough for reconstruction, and wait 50ms for reconstruction to be triggered.

  This PR requires the change made in sigp#8194 for the test to pass consistently (blob count set to 1 for all blocks instead of a random blob count between 0..max).

  Co-Authored-By: Jimmy Chen <[email protected]>

* Remove `ethers-core` from `execution_layer` (sigp#8149)

  sigp#6022

  Use `alloy_rpc_types::Transaction` to replace the `ethers_core::Transaction` inside the execution block generator.

  Co-Authored-By: Mac L <[email protected]>

* Include block root in publish block logs (sigp#8111)

  While debugging sigp#8104, it would have been helpful to quickly see in the logs that a specific block was submitted into the HTTP API. Because we want to optimize the block root computation we don't include it in the logs, and just log the block slot. I believe we can take a minute performance hit to have the block root in all the logs during block publishing.

  Co-Authored-By: dapplion <[email protected]>
  Co-Authored-By: Jimmy Chen <[email protected]>

* fix: clarify `bb` vs `bl` variable names in BeaconProcessorQueue (sigp#8315)

  Since block and blob both start with `bl`, it was not clear how to differentiate between `blbroots_queue` and `bbroots_queue`.

  After renaming, there also seems to be a discrepancy.

  Co-Authored-By: Kevaundray Wedderburn <[email protected]>

* Migrate the `deposit_contract` crate to `alloy` (sigp#8139)

  sigp#6022

  Switches the `deposit_contract` crate to use the `alloy` ecosystem and removes the dependency on `ethabi`.

  Co-Authored-By: Mac L <[email protected]>

---------

Co-authored-by: Michael Sproul <[email protected]>
Co-authored-by: antondlr <[email protected]>
Co-authored-by: chonghe <[email protected]>
Co-authored-by: hopinheimer <[email protected]>
Co-authored-by: Jimmy Chen <[email protected]>
Co-authored-by: Mac L <[email protected]>
Co-authored-by: lmnzx <[email protected]>
Co-authored-by: Javier Chávarri <[email protected]>
Co-authored-by: Lion - dapplion <[email protected]>
Addresses #7820.
I haven't benchmarked the change in Kurtosis yet, but will post results soon.