Improve batching work pool performance #252
Conversation
Thank you!
For those confused by this magic constant: channel 0 is used for "special purpose" protocol communication.
Yes, I noticed that the current design causes even the MainSession model to be registered with the ConsumerWorkService, even though no consumers can be registered on it. Seemed like a reasonable optimization to skip those registrations.
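A minimal sketch of that optimization, assuming a hypothetical registration helper (`RegisterWithWorkService` and `RegisterKey` are illustrative names, not the client's actual members):

```csharp
using RabbitMQ.Client;

// Illustrative sketch only: skip work-service registration for the channel-0
// (MainSession) model, since no consumers can ever be attached to it.
// RegisterWithWorkService and RegisterKey are made-up names for this example.
internal static class WorkServiceRegistration
{
    public static void RegisterWithWorkService(IModel model, ConsumerWorkService workService)
    {
        // Channel 0 carries only connection-level protocol traffic, so it
        // never dispatches deliveries to consumers and needs no work queue.
        if (model.ChannelNumber == 0)
        {
            return;
        }

        workService.RegisterKey(model);
    }
}
```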
Makes sense. Please leave a comment explaining that, though.
Force-pushed from f36ca33 to 0694ce8
@michaelklishin I pushed up a revised commit with a comment about model 0, let me know if it's ok.
@kjnilsson What race condition was 166cf17 trying to fix? I was testing this PR initially against 3.6.5, which didn't have it, and it looks like the ChannelNumber 0 optimization I made here is now throwing an NRE because of it. The same problem should be occurring on #253.
Force-pushed from 0694ce8 to c2535fb
I've rebased this branch on top of the latest vs-support branch changes, and I went ahead and removed the special model 0 handling that was throwing an exception because of the ordering change. Having that in this branch is much less critical.
The Concourse build failed on this test:
@kjnilsson that's likely a timing issue.
@michaelklishin yes, just wanted to note it since it isn't yet public. Running the tests again.
Force-pushed from c2535fb to 41a5574
This branch is now rebased against master. For the failing test, yes, that is expected on this PR, because the changes to the BatchingWorkPool can't guarantee ordering. Looking at the test and thinking about it more, I can see how this likely isn't an acceptable trade-off, given how acking of multiple delivery tags works.
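For illustration (not code from this PR), this is why dispatch order matters: acking with `multiple: true` acknowledges every outstanding delivery up to the given tag, which is only safe if earlier deliveries were actually handled first. The queue name and consumer wiring here are assumptions.

```csharp
using RabbitMQ.Client;
using RabbitMQ.Client.Events;

internal static class MultipleAckExample
{
    public static void Consume(IModel channel)
    {
        var consumer = new EventingBasicConsumer(channel);
        consumer.Received += (sender, args) =>
        {
            // ... process args.Body ...

            // Acknowledges every unacked delivery up to args.DeliveryTag.
            // If deliveries reach the consumer out of order, this could ack
            // messages that haven't actually been processed yet.
            channel.BasicAck(args.DeliveryTag, multiple: true);
        };

        channel.BasicConsume("my-queue", false, consumer);
    }
}
```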
@bording can you explain where the ordering guarantees are given up with your approach?
If we really give up the ordering guarantee in this PR, it's a no-go compared to #253.
@kjnilsson Take a look at the existing implementation. @Scooletz and I tried to come up with another way to prevent the same model from being queued multiple times, but everything we came up with either required reintroducing locks (defeating the point of this change) or could potentially run into the opposite problem of having work items queued with nothing scheduled to process them. When we realized that, we went with a more thorough redesign of the work pool.
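To make the trade-off concrete, here is a minimal sketch (not the actual BatchingWorkPool code) of lock-free per-model scheduling with an Interlocked flag; the re-check after clearing the flag is exactly the kind of subtlety that can otherwise leave work items queued with nothing scheduled to drain them:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Minimal sketch (not the actual BatchingWorkPool): a lock-free per-model
// work queue that uses an Interlocked flag to avoid scheduling the same
// model twice.
internal class ModelWorkQueue
{
    private readonly ConcurrentQueue<Action> _work = new ConcurrentQueue<Action>();
    private int _scheduled; // 0 = idle, 1 = a drain task is pending

    public void Enqueue(Action item)
    {
        _work.Enqueue(item);

        // Only schedule a drain if one isn't already pending.
        if (Interlocked.CompareExchange(ref _scheduled, 1, 0) == 0)
        {
            Task.Run(() => Drain());
        }
    }

    private void Drain()
    {
        Action item;
        while (_work.TryDequeue(out item))
        {
            item();
        }

        // Clear the flag, then re-check the queue: an item enqueued after the
        // last TryDequeue but before this point would otherwise sit there
        // with no drain scheduled to pick it up.
        Interlocked.Exchange(ref _scheduled, 0);
        if (!_work.IsEmpty && Interlocked.CompareExchange(ref _scheduled, 1, 0) == 0)
        {
            Task.Run(() => Drain());
        }
    }
}
```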
@michaelklishin If ordering is a hard requirement that can't be relaxed, then I think #253 is the only option, unless there's some way we haven't thought of to bring it back to this PR without reintroducing a lock. I only realized today that it might not be possible to give up ordering because of the multiple delivery tag ack option. We don't use that in our code at all, and I had forgotten that it was even part of the spec. 😢 If ordering weren't a requirement, then I would actually prefer this approach, because it maintains the ability to pass in a custom task scheduler for concurrency throttling, and since it's not using dedicated threads that block when they don't have work, you see a lot less "synchronization time" when profiling compared to #253. Given that this approach does have some benefits, that was why we initially presented both options.
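As a concrete example of the throttling that a custom scheduler enables (a sketch under assumed usage; the stock ConcurrentExclusiveSchedulerPair stands in for whatever scheduler a user might supply):

```csharp
using System.Threading.Tasks;

internal static class ThrottledSchedulerExample
{
    public static TaskScheduler CreateThrottledScheduler(int maxConcurrency)
    {
        // ConcurrentExclusiveSchedulerPair caps how many tasks its
        // ConcurrentScheduler will run at the same time.
        var pair = new ConcurrentExclusiveSchedulerPair(TaskScheduler.Default, maxConcurrency);
        return pair.ConcurrentScheduler;
    }

    public static void Run()
    {
        var factory = new TaskFactory(CreateThrottledScheduler(maxConcurrency: 2));

        // Work items queued through this factory never use more than two
        // threads at a time, regardless of how many are pending.
        for (int i = 0; i < 10; i++)
        {
            factory.StartNew(() => { /* handle a delivery */ });
        }
    }
}
```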
We cannot give up ordering, for backwards compatibility reasons as well. I think the numbers you posted from #253 are very comparable. Not sure how many users provide a custom task scheduler.
Ok, I've had a play with the PR to see if I can get the […]. I now agree we should proceed with #253. It does add a breaking change to a public interface (hello v5!) and a fair bit of internal change as well, but the approach is simple and easy to reason about. I don't think that many use a custom TaskScheduler, but with a one-to-one relationship between […]
Speaking of versions, we should probably release […]
This is the first PR mentioned in #251.
Currently this is based on my vs-support branch from #248; once that's merged, I'll rebase this against master instead.
f36ca33 is what is new in this PR.