refactor: 10x faster RPC splitting #615
Conversation
gossipsub.go (Outdated)

```go
// split splits the given RPC. If a sub-RPC is too large and can't be split
// further (e.g. Message data is bigger than the RPC limit), then it will be
// returned as an oversized RPC. The caller should filter out oversized RPCs.
func (rpc *RPC) split(limit int) iter.Seq[RPC] {
```
This method should probably live in pubsub.go instead. I kept it here to make reviewing easier, but I'd like to move it once the reviewers approve.
benchstat:
(force-pushed from f4afabe to 3fb6361)
algorandskiy
left a comment
Looks great, and an impressive +90% perf gain compared to mine (feeb, +66%) :)
gossipsub.go (Outdated)

```go
	lastRPC.Subscriptions = append(lastRPC.Subscriptions, sub)
	out = append(out, lastRPC)
for _, sub := range rpc.Subscriptions {
	if nextRPC.Subscriptions = append(nextRPC.Subscriptions, sub); nextRPC.Size() > limit {
```
`nextRPC.Size()`: this is the same issue of calling Size over and over in a loop as for Publish, but it probably doesn't matter much here since (from my understanding) new subscriptions, as well as control messages, are relatively rare compared to Publish.
Yes, exactly. Notice that we flush the publish messages above, so here we aren't calling .Size on Publish messages again.
Splitting due to control messages should be extremely rare (happy to be proven wrong), so I opted for the easier thing and kept the old code.
Thinking about it just now: we could add a fast-path check before these loops that tests whether the original RPC, minus the publish messages, is small enough to fit and can be yielded directly.
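For context on the quadratic concern raised above, the sketch below contrasts calling a full `Size()` re-scan on every append with tracking a running size. The types are hypothetical stand-ins, not the library's; the point is only the complexity pattern being discussed.

```go
package main

import "fmt"

// item and batch are hypothetical stand-ins for messages and an RPC.
type item struct{ data []byte }

type batch struct{ items []item }

// Size re-walks every item on each call, so calling it inside an
// append loop makes the loop O(n^2) in the number of items.
func (b batch) Size() int {
	n := 0
	for _, it := range b.items {
		n += len(it.data)
	}
	return n
}

func main() {
	items := []item{{make([]byte, 3)}, {make([]byte, 3)}, {make([]byte, 3)}}

	// Quadratic pattern (the concern with nextRPC.Size() in the loop):
	var b batch
	for _, it := range items {
		b.items = append(b.items, it)
		_ = b.Size() // full re-scan after every append
	}

	// Linear pattern: maintain the size incrementally instead.
	size := 0
	for _, it := range items {
		size += len(it.data)
	}
	fmt.Println(b.Size(), size) // same total, computed once vs. incrementally
}
```

As the comments in the thread note, for rare subscription/control splits the quadratic cost is usually negligible; it only matters on the hot Publish path.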
Added this, and it gives a nice small improvement: the many-large benchmark is around 3.3 µs, and the 2-large one is around 28 ns.
This assumes it is common for control messages not to be the reason we have to split.
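The fast path described above can be sketched as follows. This is a minimal illustration under assumed, simplified types (`RPC` with only `Subscriptions` and `Control` fields), not the PR's actual code: once the publish messages have been flushed, if the remainder of the RPC already fits under the limit it can be yielded whole, skipping the per-subscription and per-control splitting loops.

```go
package main

import "fmt"

// RPC is a hypothetical stand-in holding only the non-publish parts
// that remain after publish messages have been flushed.
type RPC struct {
	Subscriptions []string
	Control       []string
}

// Size approximates the wire size of the remaining fields.
func (r RPC) Size() int {
	n := 0
	for _, s := range r.Subscriptions {
		n += len(s)
	}
	for _, c := range r.Control {
		n += len(c)
	}
	return n
}

// fitsWhole is the fast-path check: if the rest of the RPC fits, the
// caller yields it as a single RPC and returns early.
func fitsWhole(rest RPC, limit int) bool {
	return rest.Size() <= limit
}

func main() {
	rest := RPC{
		Subscriptions: []string{"topicA", "topicB"},
		Control:       []string{"IHAVE"},
	}
	// Common case: subscriptions and control are small, so no splitting
	// loops are needed at all.
	fmt.Println(fitsWhole(rest, 64))
}
```

This matches the stated assumption: the check only pays off when control messages are rarely the reason an RPC must be split.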
This release contains a couple of fixes and the new Batch Publishing feature.

- #607 Batch Publishing. Useful if you are publishing a group of related messages at once.
- #612 Send IDONTWANT before initial publish. Useful when many nodes may publish the same message at once.
- #609 Avoid sending an extra IDONTWANT to the peer that just sent you a message.
- #615 10x faster RPC splitting.
Builds on #582. 10x faster than current master. 0 allocs.
The basic logic is the same as the old version, except we return an `iter.Seq[RPC]` and yield `RPC` values instead of a slice of `*RPC`. This lets us avoid allocations for heap pointers.
Please review @algorandskiy, and let me know if this improves your use case.