[WIP] KAFKA-19588: Reduce number of events generated in AsyncKafkaConsumer.poll() #20363
base: trunk
…poll() We create—and wait on—PollEvent in Consumer.poll() to ensure we wait for reconciliation and/or auto-commit. However, reconciliation is relatively rare, and auto-commit only happens every N seconds, so the remainder of the time, we should try to avoid sending poll events.
@lianetm Could you add the
Thanks @kirktrue, took a first look; one high-level concern for now.
    // the interval time or reconciling new assignments
    applicationEventHandler.add(event);

    if (reconciliationInProgress.get() || autoCommitState.shouldAutoCommit()) {
Couldn't we end up with a race condition here if the app thread sees autoCommitState.shouldAutoCommit() return false at this point (because the interval hasn't expired just yet), but the interval has expired by the time the background thread runs the same check while processing the poll event?
In that case, I expect the background thread would trigger the auto-commit while the app thread has already moved on to updating positions for fetching (which leads to a whole new set of race conditions that we already dealt with before). Basically, whatever change we introduce here to avoid waiting on Poll needs to ensure that we retrieve the positions to commit before moving on to update fetch positions; that's the main challenge I expect with this change. Still thinking, but not sure yet how to address that if we don't wait on Poll. Thoughts?
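To make the concern concrete, here is a minimal, deterministic sketch of the check-then-act gap described above. It is not the PR's code: `AutoCommitRaceSketch`, its manual clock, and `decisionsDiverge` are hypothetical names used only for illustration, and the two threads are simulated as two sequential checks separated by a small delay.

```java
/** Hypothetical stand-in for an interval-based auto-commit timer; not Kafka's actual class. */
final class AutoCommitRaceSketch {
    private final long intervalMs;
    private long nowMs; // manual clock so the example is deterministic

    AutoCommitRaceSketch(long intervalMs) {
        this.intervalMs = intervalMs;
    }

    void advance(long ms) {
        nowMs += ms;
    }

    /** True once the interval has elapsed since time zero. */
    boolean shouldAutoCommit() {
        return nowMs >= intervalMs;
    }

    /**
     * Simulates the app thread checking first and the background thread
     * checking slightly later. Returns true when the app thread decided
     * not to wait but the background thread decided to commit.
     */
    static boolean decisionsDiverge(long intervalMs, long appCheckAtMs, long backgroundDelayMs) {
        AutoCommitRaceSketch state = new AutoCommitRaceSketch(intervalMs);
        state.advance(appCheckAtMs);
        boolean appThreadWaits = state.shouldAutoCommit();
        state.advance(backgroundDelayMs);
        boolean backgroundCommits = state.shouldAutoCommit();
        return !appThreadWaits && backgroundCommits;
    }
}
```

With a 5000 ms interval, an app-thread check at 4999 ms followed by a background check 1 ms later diverges, which is the window in which the app thread could start updating fetch positions while the commit is being built.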
I agree with @lianetm that this is opening up the risk of race conditions. However, I think the principle here is a good one. The risk part here is related to the auto-commit timer. If auto-commit is not enabled, we absolutely know that we are not racing with the auto-commit timer. If it is enabled, we are potentially in a race. So, a slight twist on this can safely optimise when auto-commit is not enabled.
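One way to read this suggestion is as a guard on configuration rather than on timer state. The sketch below assumes that reading; `PollWaitPolicy` and `mustWaitOnPollEvent` are hypothetical names, and treating the reconciliation flag as a second reason to wait is taken from the condition in the diff, not from this comment.

```java
/** Hypothetical helper illustrating the suggested guard; not part of the PR. */
final class PollWaitPolicy {
    private PollWaitPolicy() {
    }

    /**
     * Decides whether the app thread must block on the poll event.
     * The decision depends on whether auto-commit is enabled at all,
     * which cannot change between the two threads' checks, instead of
     * on whether the auto-commit interval has expired right now.
     */
    static boolean mustWaitOnPollEvent(boolean autoCommitEnabled, boolean reconciliationInProgress) {
        return autoCommitEnabled || reconciliationInProgress;
    }
}
```

Under this policy the wait is skipped only when auto-commit is disabled and no reconciliation is in progress, so the timer can never expire between the app thread's decision and the background thread's processing.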
Just a couple of trivial initial comments. I've skimmed over the PR and understand the overall flow now. I'll do a more in-depth review shortly.
     * Reset the auto-commit timer to the provided time (backoff), so that the next auto-commit is
     * sent out then. If auto-commit is disabled this will perform no action.
     */
    void resetTimer(long retryBackoffMs);
nit: Why not final long retryBackoffMs also?
    this.log = logContext.logger(AutoCommitState.class);
    this.timer = time.timer(autoCommitInterval);
    this.autoCommitInterval = autoCommitInterval;
    this.hasInflightCommit = new AtomicBoolean();
Doesn't the presence of synchronized on all of these methods make the use of AtomicBoolean redundant?
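For illustration, a sketch of what the flag could look like with a plain field, assuming every read and write really does go through a synchronized method on the same object. `InflightFlagSketch` and its method names are hypothetical and do not mirror the PR's AutoCommitState.

```java
/** Hypothetical sketch: a plain boolean guarded by the object's monitor. */
final class InflightFlagSketch {
    // Guarded by "this"; safe only because all access is via synchronized methods.
    private boolean hasInflightCommit;

    /** Marks a commit as in flight; returns false if one is already in flight. */
    synchronized boolean tryStartCommit() {
        if (hasInflightCommit) {
            return false;
        }
        hasInflightCommit = true;
        return true;
    }

    /** Clears the flag once the commit response has been handled. */
    synchronized void commitCompleted() {
        hasInflightCommit = false;
    }

    synchronized boolean hasInflightCommit() {
        return hasInflightCommit;
    }
}
```

The monitor already provides both mutual exclusion and visibility, so the atomic wrapper adds nothing here. If any accessor were left unsynchronized, the field would need to be volatile or atomic again.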