Retry pushing message 5 times, then reject the message #1545
base: main
Claude finished @snichme's task: Code Review Complete

Problems Found:

1. Race Condition Risk in MultiDestinationHandler

    def push(msg, source) : Bool
      if dest = @current_dest
        dest.push(msg, source)
      else
        false
      end
    end

2. Inconsistent Error Handling in HTTPDestination

    case @ack_mode
    in AckMode::OnConfirm, AckMode::OnPublish
      raise FailedDeliveryError.new unless response.success?
      source.ack(msg.delivery_tag)
    in AckMode::NoAck
    end
    true # This always returns true even if response.success? failed above

3. Missing Error Handling for Reject Operation

The retry loop doesn't handle potential exceptions when calling the reject operation.

4. Hard-coded Retry Parameters

The retry count (5) and sleep duration (500ms) are hard-coded, making them difficult to configure for different use cases or environments.
carlhoerberg
left a comment
Have you considered a delivery-limit policy, or at least looking at the x-delivery-count header?
    case @ack_mode
    in AckMode::OnConfirm
    ch.basic_publish(msg.body_io, ex, rk, props: msg.properties) do
    if ch.basic_publish_confirm(msg.body_io, ex, rk, props: msg.properties)
fyi, the block to basic_publish takes a boolean too, the result from the confirm (ack/nack).
    - if ch.basic_publish_confirm(msg.body_io, ex, rk, props: msg.properties)
    + ch.basic_publish(msg.body_io, ex, rk, props: msg.properties) do |ok|
The upside/downside with this approach is that this fiber is blocked until an ack/nack is received from the server, which hampers throughput but also limits the number of unconfirmed messages to 1.
you could reject in the block instead, with requeue if the x-delivery-count is less than 5, and without requeue if > 5. Or even use the delivery-limit queue argument/policy somehow, instead of this hardcoded one.
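The requeue decision suggested here can be sketched roughly as below. This is only an illustration, not code from the PR: `requeue?` and `MAX_DELIVERIES` are hypothetical names, reading `x-delivery-count` out of the message properties is left out, and treating a count of exactly 5 as "do not requeue" is an assumption, since the comment only covers "less than 5" and "> 5".

```crystal
# Hypothetical helper, not part of the PR: decide whether a rejected
# message should be requeued, based on its x-delivery-count value.
MAX_DELIVERIES = 5 # assumed limit, mirroring the "5" discussed in this thread

# delivery_count is nil when the header is absent; that case is treated
# as a first delivery (an assumption).
def requeue?(delivery_count : Int32?, limit : Int32 = MAX_DELIVERIES) : Bool
  (delivery_count || 0) < limit
end

# Requeue while under the limit, drop (or dead-letter) once it is reached.
puts requeue?(nil) # header missing: requeue
puts requeue?(4)   # under the limit: requeue
puts requeue?(5)   # limit reached: reject without requeue
```

With a helper like this, the confirm block would reject with the returned value as the requeue flag; whether the message is then dead-lettered or lost depends on a DLX being configured on the source queue.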
I like the idea of using a delivery limit. But then again, the message will be removed from the queue and "disappear" if there is no DLX configured.
What I wanted to achieve with this PR is that the shovel basically just waits until the destination is available again, as a default behavior.
    push_retries = 0
    until @destination.push(msg, @source)
      if push_retries >= 5
        @source.reject(msg.delivery_tag)
Wasn't the message rejected with requeue? That means the message will be retried again soon?
The idea is to try 5 times and then put the message back in the queue so it can be removed or consumed by another consumer.
I don't want to reject without requeue, since the message will disappear if no DLX is configured, and the issue behind this PR, #1357, was precisely about disappearing messages.
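To make the intended behavior concrete, here is a minimal, self-contained sketch of "retry, then reject with requeue". It is not the PR's code: `FakeDestination`, `FakeSource` and `shovel` are hypothetical stand-ins, the messages are plain strings, and the placement of the counter increment and the sleep is assumed. The defaults of 5 retries and 500 ms come from the values mentioned in the review.

```crystal
# Hypothetical stand-in for a shovel destination.
class FakeDestination
  getter attempts = 0

  def initialize(@succeed_on : Int32)
  end

  # Returns true once the configured attempt number is reached.
  def push(msg : String) : Bool
    @attempts += 1
    @attempts >= @succeed_on
  end
end

# Hypothetical stand-in for a shovel source; records requeued messages.
class FakeSource
  getter requeued = [] of String

  def reject(msg : String, requeue : Bool)
    @requeued << msg if requeue
  end
end

# Pushes msg, retrying up to max_retries times after the first failure.
# When retries are exhausted, the message is rejected with requeue so it
# stays in the source queue instead of disappearing.
def shovel(msg : String, source : FakeSource, destination : FakeDestination,
           max_retries : Int32 = 5, delay : Time::Span = 500.milliseconds) : Bool
  retries = 0
  until destination.push(msg)
    if retries >= max_retries
      source.reject(msg, requeue: true)
      return false
    end
    retries += 1
    sleep delay
  end
  true
end

source = FakeSource.new
# Destination that recovers on the third attempt: delivered, nothing requeued.
shovel("m1", source, FakeDestination.new(3), delay: Time::Span.zero)
# Destination that never recovers: the message ends up requeued.
shovel("m2", source, FakeDestination.new(1000), delay: Time::Span.zero)
```

Taking the retry count and delay as parameters rather than literals also addresses the hard-coded values flagged in the review.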
WHAT is this pull request doing?
Fixes #1357
Checks the return value of publish_confirm and, if the message was not published, retries 5 times before rejecting the message.
HOW can this pull request be tested?
There is a script attached to the issue; run it before and after applying this fix.
Before the fix, q1 is simply emptied and the messages disappear. After the fix, the messages are kept in the queue.