Ensures pending counter in rabbit_shovel_status is always an integer #14614
bccc0d4
to
fe6fb4c
Compare
The core team has no objection to this. If we consider this a bug fix, it can go into |
status(_) ->
    running.

pending_count(#{source := #{current := #{unacked_message_q := UAMQ}}}) ->
Pending messages in AMQP 0.9.1 are counted on the destination side, so in local shovels they should be dest -> unconfirmed.
If I see it correctly, there is no flow control between a local shovel and a queue, so there can be no pending (that is, buffered) messages for local shovels, according to my definition of pending.
dest -> unconfirmed is only non-zero in on-confirm ack-mode. source -> unacked_message_q can be non-empty for both on-confirm and on-publish.
    %% Destination not yet connected
    ignore.

pending_count(#{dest := Dest}) ->
Pending messages in AMQP 0.9.1 are counted on the destination side, so in AMQP 1.0 shovels they should be dest -> unacked.
My understanding is that for AMQP 1.0 the dest -> unacked field is only used in on-confirm ack-mode. But there can be pending messages in other ack-modes too, when there is no more link credit to send the message to the dest. In that case the whole message is buffered in the shovel process. As I understand it, the pending metric only counts the number of buffered (that is, unsent) messages. Unacked additionally counts messages that were sent but not yet acked by the dest.
There is also the metric remaining_unacked, but I'm not sure it is counted correctly for on-confirm. It is decremented when the message is sent (the same way as for other ack-modes), not when the message is accepted or rejected by the dest.
To sum up my comments, I think:
- if the pending metric should be the count of messages buffered in the shovel, this PR is correct for AMQP 0.9.1 and 1.0, but the count should always be zero for local shovels
- if the pending metric should be the count of messages not yet acked to the source queue, this PR is correct for local shovels but needs to be updated for AMQP 0.9.1 and 1.0
I think exposing the count of buffered messages is more relevant, as those can take up considerable memory and are less visible in source/dest connection and queue metrics.
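To make the two candidate definitions above concrete, here is a minimal Erlang sketch. It does not use the shovel's real state layout: the map keys `buffered` and `unacked`, the module name, and both function names are hypothetical and chosen only for illustration. It assumes buffered messages sit in a queue and messages that were sent but not yet acknowledged are tracked in a map.

```erlang
-module(pending_sketch).
-export([buffered_count/1, unacked_to_source_count/1]).

%% Hypothetical state shape (NOT the real shovel state):
%%   buffered :: queue of messages held in the shovel process, not yet sent
%%   unacked  :: map of messages sent to the dest but not yet acked

%% Definition 1: only messages still buffered (unsent) in the shovel.
buffered_count(#{buffered := Q}) ->
    queue:len(Q);
buffered_count(_) ->
    0.

%% Definition 2: everything not yet acked to the source queue,
%% that is, buffered messages plus sent-but-unacked messages.
unacked_to_source_count(#{buffered := Q, unacked := U}) ->
    queue:len(Q) + maps:size(U);
unacked_to_source_count(_) ->
    0.
```

Under these assumptions the second count is never smaller than the first, so the two definitions differ exactly by the messages that are in flight to the destination.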
Thanks for the clarification, @gomoripeti. Regarding the |
Thank you, @dcorbacho, for your help and feedback. Based on this, I made some changes to what the local shovel returns. |
Approving per discussion with Diana.
Ensures pending counter in rabbit_shovel_status is always an integer (backport #14614)
Proposed Changes
We observed that when a shovel connection for AMQP 0.9.1 enters the blocked state, pending messages stack up in a list of messages. As a result, the response of rabbit_shovel_status:status() can get very large and can cause memory leaks with many shovels (or one shovel using no-ack). It also breaks the management UI for shovel status.
Steps to reproduce:
- rabbit_shovel_status:status() on the source will show pending as a list of messages.
- rabbit_shovel_status:status() will show #{pending => {[],[]},
This PR ensures that the number of pending messages is an integer for all shovel protocols. I have tried to confirm whether the bug exists in AMQP 1.0, but I ran into other issues there, which I am currently investigating.
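As a rough illustration of "always an integer", the sketch below normalizes a pending value to a count. It is not the code from this PR: the module name and `to_count/1` are hypothetical. It assumes the pending value may already be an integer, may be a plain list of messages, or may be a queue from Erlang's `queue` module (the `{[],[]}` shown above looks like an empty queue).

```erlang
-module(pending_count_sketch).
-export([to_count/1]).

%% Already a count: pass it through unchanged.
to_count(N) when is_integer(N), N >= 0 ->
    N;
%% A plain list of buffered messages: report its length.
to_count(L) when is_list(L) ->
    length(L);
%% Anything else: treat it as a queue if it is one, otherwise report 0.
to_count(Other) ->
    case queue:is_queue(Other) of
        true -> queue:len(Other);
        false -> 0
    end.
```

Reporting only the count keeps the status response small regardless of how many messages are buffered, which is the property the description above is after.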