Currently, a scan on a replica is serialized: there is only one scan iterator per replica. This is too slow in some cases, for example when copying data with tools.
We can implement parallel scans on a replica:
1. Count the key-value pairs in the replica, say N.
2. Manually specify how many scans to run on the replica, say M.
3. Get the start keys by scanning the replica again:
   a. Start scanning the replica again.
   b. Record the i-th end key each time N/M key-value pairs have been counted.
   c. Repeat a~b until we reach the end of the replica.
   d. In the end, we may get fewer or more than M key pairs (start key, end key), since the data in the replica may grow or shrink by the second scan, or some data may have expired.
4. Assign each key pair to a different scanner (corresponding to a client-side thread); the scanners can then run in parallel to speed up the whole scan.
NOTE: sub-scans may cover the same hashkey, and atomicity cannot be guaranteed across different scans, so the combined result is not atomic.
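The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual server or client code: the replica is modeled as an in-memory list of (key, value) tuples sorted by key, and all function names (`count_pairs`, `find_split_keys`, `to_ranges`, `scan_range`, `parallel_scan`) are hypothetical. Rounding N/M up to pick the step size is also an assumption.

```python
from bisect import bisect_left
from concurrent.futures import ThreadPoolExecutor


def count_pairs(replica):
    """Pass 1: count the key-value pairs in the replica (N)."""
    return sum(1 for _ in replica)


def find_split_keys(replica, n, m):
    """Pass 2: walk the keys in order and record a boundary every ceil(n/m) pairs."""
    step = max(1, -(-n // m))  # ceil(n / m), at least 1 (assumed rounding)
    boundaries = []
    for i, (key, _) in enumerate(replica):
        if i > 0 and i % step == 0:
            boundaries.append(key)
    return boundaries


def to_ranges(boundaries):
    """Turn boundaries into half-open [start, end) ranges; None means unbounded."""
    starts = [None] + boundaries
    ends = boundaries + [None]
    return list(zip(starts, ends))


def scan_range(replica, start, end):
    """One sub-scan: return the pairs whose key falls in [start, end)."""
    keys = [k for k, _ in replica]
    lo = 0 if start is None else bisect_left(keys, start)
    hi = len(keys) if end is None else bisect_left(keys, end)
    return replica[lo:hi]


def parallel_scan(replica, m):
    """Run one sub-scan per key range on its own thread and concatenate the results."""
    n = count_pairs(replica)
    ranges = to_ranges(find_split_keys(replica, n, m))
    with ThreadPoolExecutor(max_workers=len(ranges)) as pool:
        parts = pool.map(lambda r: scan_range(replica, *r), ranges)
        return [pair for part in parts for pair in part]
```

In this sketch the replica does not change between the two passes, so the number of ranges depends only on N and the step size. On a live replica, writes and expirations between the counting pass and the splitting pass would shift the boundaries, which is why the number of key pairs can differ from M.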