Initial replication checksums #343
Conversation
🦋 Changeset detected. Latest commit: 4f4b3d5. The changes in this PR will be included in the next version bump. This PR includes changesets to release 12 packages.
The description and changes make sense and look good to me. I'm not familiar enough with this part of the service to be a good reviewer, though.
#341 introduced pre-computing of checksums when compacting, and also triggered a compact as part of initial replication. However, the compacting was not resumable like the rest of initial replication, and could also be very slow. That could lead to an indefinite loop of trying to compact and failing with a timeout before being able to mark initial replication as complete, often with a

```
Replication error Unable to do postgres query on ended connection
```

error.

This changes the pre-computing after initial replication to use the same checksum calculations as for normal API requests. It also changes that checksum calculation to handle timeouts by falling back to a piece-wise checksum calculation. This is slower for many buckets, but more robust, so it is only used as a fallback.
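To illustrate the timeout fallback, here is a minimal TypeScript sketch of the pattern. This is not the actual service code: `computeBulkChecksums`, `computePiecewiseChecksum` and `isTimeoutError` are hypothetical helpers standing in for the real checksum queries.

```typescript
// Minimal sketch of the timeout fallback described above (not the actual service code).
// The three declared helpers are hypothetical stand-ins for the real checksum queries.

interface BucketChecksum {
  bucket: string;
  checksum: number;
  count: number;
}

// Computes checksums for all requested buckets in a single aggregated query.
declare function computeBulkChecksums(
  buckets: string[],
  options: { maxTimeMS: number }
): Promise<BucketChecksum[]>;

// Computes the checksum for one bucket using smaller, cheaper queries.
declare function computePiecewiseChecksum(bucket: string): Promise<BucketChecksum>;

// Distinguishes a query timeout from other errors.
declare function isTimeoutError(error: unknown): boolean;

async function getChecksums(buckets: string[]): Promise<BucketChecksum[]> {
  try {
    // Fast path: one query for all buckets, bounded by a timeout.
    return await computeBulkChecksums(buckets, { maxTimeMS: 60_000 });
  } catch (error) {
    if (!isTimeoutError(error)) {
      throw error;
    }
    // Fallback path: slower for many buckets, but each query is small
    // enough that it is unlikely to hit the timeout on its own.
    const results: BucketChecksum[] = [];
    for (const bucket of buckets) {
      results.push(await computePiecewiseChecksum(bucket));
    }
    return results;
  }
}
```

The bulk path stays the default, so the slower piece-wise path only runs when a query actually hits the timeout.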
This also fixes a performance issue with the compact: The previous implementation only read a single batch of 101 documents at a time, even though MongoDB scanned up to the limit of 10_000 documents behind the scenes. We now read all 10_000 documents in the batch.
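As a rough illustration of the batching change, the sketch below (assuming the official `mongodb` Node.js driver; the collection and query are placeholders, not the service's actual schema) matches the cursor's batch size to the scan limit so the driver returns the full 10_000-document batch instead of the default first batch of 101 documents.

```typescript
// Rough sketch of the batching change, assuming the official `mongodb` Node.js driver.
// Collection and query here are placeholders, not the service's actual schema.
import { MongoClient } from 'mongodb';

const SCAN_LIMIT = 10_000;

async function readCompactBatch(client: MongoClient) {
  const collection = client.db('powersync').collection('bucket_data');

  // Without an explicit batchSize, the driver's first batch contains only 101
  // documents, even though the server already scanned up to the 10_000-document
  // limit. Matching batchSize to the limit returns the whole scan in one batch.
  const cursor = collection
    .find({})
    .sort({ _id: 1 })
    .limit(SCAN_LIMIT)
    .batchSize(SCAN_LIMIT);

  return await cursor.toArray();
}
```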