HDDS-10983. EC Key read corruption when the replica index of container in DN mismatches #6779

swamirishi · 2024-06-06T07:42:27Z

What changes were proposed in this pull request?

The ReadChunk & GetBlock APIs don't validate replicaIndex parameter in case of replica index mismatch. This leads to read corruption when there is an already existing open stream, when background SCM service moves container data around datanodes.

The patch aims to add a validation check on the Datanode on the getBlock request and also pass replication index as part of the GetBlock and ReadChunk request.

Things done in the patch:

Make ReadChunk & GetBlock request send replication index in ChunkInputStream.(This comes from pipeline supplier in ECBlockInputStream)
Validate replica index passed in the GetBlock & ReadChunk request with the container data present in the datanode & fail if it mismatches.
The changes should be compatible with older clients(this may still lead to read data corruption, but this shouldn't block the reads entirely). I have added another version in OzoneClient to mark the patch, I am using the optional parameter version in ContainerCommandRequestProto.
Refactor all ContainerCommandRequestProto object creation throughout the code to call a central function which sets the aforementioned version parameter with the latest version of client.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-10983

How was this patch tested?

Unit Tests and Integration tests

…r in DN mismatches Change-Id: Ic88575f31305bde9d78b0e4d0c7bbf25c53a7ccb

Change-Id: I4386a0de5d61d1cec51f63ed7f75d531fa389e9c

Change-Id: Ic1175d69441957382fb9213f7e37e3047d284806

hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/client/BlockID.java

sodonnel · 2024-06-07T09:05:27Z

hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/ClientVersion.java

@@ -42,6 +42,9 @@ public enum ClientVersion implements ComponentVersion {
      "This client version has support for Object Store and File " +
          "System Optimized Bucket Layouts."),

+  ERASURE_CODING_READ_CHUNK_CORRUPTION_FIX(4,


Do we need this new client version? Can we not simple check on the server that:

if (blockIDProto.hasReplicaIndex()) { // do the verify. }

I suspect we add new fields in places fairly often without creating new client versions.

I wonder what the trigger for a new client version should be, verses just checking for the presence of a field in the proto?

I don't think we need the new version constant (and all the refactoring to be able to use it).

New client version constant is needed when server has to decide whether it should send some piece of information (e.g. a new type of replication config, new bucket type, or new ports) to the client.

If server only wants to decide whether to perform some server-side processing, it can do so based on request content received from client. Like if client sends chunk data as a list, it can handle that, otherwise handle the single blob.

Yeah we want to enforce that the block is validated everytime we do a Getblock command or ReadChunk command for newer client versions. This is to ensure that the newer client doesn't have the same regression in some other flows.

This failure for newer client would atleast raise an alarm when things fail.

I have ensured to the best of knowledge taken care of all the existing flows. But in future, if we forget to set replicaIndex in some other request, it should fail.

Thanks for explaining why the new version is required.

READ_CHUNK_CORRUPTION_FIX is too generic.

Let's rename the constant to something like EC_REPLICA_INDEX_REQUIRED_IN_BLOCK_REQUEST. This describes the expected client behavior.

Please also update the description text.

EC_REPLICA_INDEX_REQUIRED_IN_BLOCK_REQUEST sounds good to me too.

sodonnel

The approach looks good to me. I have a few questions I added as comments. Thanks for working on this.

Change-Id: Ifd60cb93ccb5e6ff7c81995734459719f6d5ecc3

Change-Id: I8f6e02e917dbc4d99294ca3c6d61109b1dafb897

sodonnel

LGTM when the find bugs are fixed and CI is green.

adoroszlai · 2024-06-07T12:59:04Z

...ner-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/helpers/BlockUtils.java

+   * Verify if request block BCSID is supported.
+   *
+   * @param container container object.
+   * @param blockID requested block info
+   * @throws IOException if cannot support block's blockCommitSequenceId


nit: comments leftover from verifyBCSId

adoroszlai · 2024-06-07T13:04:55Z

hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/ClientVersion.java

@@ -42,6 +42,9 @@ public enum ClientVersion implements ComponentVersion {
      "This client version has support for Object Store and File " +
          "System Optimized Bucket Layouts."),

+  ERASURE_CODING_READ_CHUNK_CORRUPTION_FIX(4,


I don't think we need the new version constant (and all the refactoring to be able to use it).

New client version constant is needed when server has to decide whether it should send some piece of information (e.g. a new type of replication config, new bucket type, or new ports) to the client.

If server only wants to decide whether to perform some server-side processing, it can do so based on request content received from client. Like if client sends chunk data as a list, it can handle that, otherwise handle the single blob.

adoroszlai · 2024-06-07T16:37:49Z

hadoop-hdds/client/src/main/java/org/apache/hadoop/hdds/scm/storage/ChunkInputStream.java

      xceiverClient = xceiverClientFactory.acquireClientForReadData(
          pipelineSupplier.get());
+      updateDatanodeBlockId();


Please store Pipeline pipeline = pipelineSupplier.get() and pass it to updateDatanodeBlockId().

There is one more pipelineSupplier.get() leftover in updateDatanodeBlockId():

ozone/hadoop-hdds/client/src/main/java/org/apache/hadoop/hdds/scm/storage/ChunkInputStream.java

Line 301 in 1476d59

int replicaIdx = pipelineSupplier.get().getReplicaIndex(closestNode);

adoroszlai · 2024-06-07T16:41:36Z

hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/client/BlockID.java

+  public void setReplicaIndex(Integer replicaIndex) {
+    this.replicaIndex = replicaIndex;
+  }


This is unused. If removed, replicaIndex member can be final.

swamirishi · 2024-06-07T16:52:36Z

@adoroszlai We want to enforce replica index is sent by the newer clients all the time. This could cause another regression otherwise in the future if someone forgets to set the replica index. The proto parameter replicaIndex is optional, I see this server side check is the only way to achieve this. If there is some other way you think this could be done, I am open to suggestions.

adoroszlai · 2024-06-07T16:56:11Z

We want to enforce replica index is sent by the newer clients all the time. This could cause another regression otherwise in the future if someone forgets to set the replica index. The proto parameter replicaIndex is optional, I see this server side check is the only way to achieve this. If there is some other way you think this could be done, I am open to suggestions.

Thanks for the explanation. You are right, client version can be used to enforce different client behavior depending on version, too.

swamirishi · 2024-06-07T17:04:45Z

We want to enforce replica index is sent by the newer clients all the time. This could cause another regression otherwise in the future if someone forgets to set the replica index. The proto parameter replicaIndex is optional, I see this server side check is the only way to achieve this. If there is some other way you think this could be done, I am open to suggestions.

Thanks for the explanation. You are right, client version can be used to enforce different client behavior depending on version, too.

I can add a unit test case to ensure, older client requests don't fail.

Change-Id: If2af2959ab79ee0ffd275c21909c9dc86e1c2c38

Change-Id: I44de245342a7622d1c3fd4b49dd1db4c2bcac5b4

Change-Id: I9a8905dad8f4db731b110644fa03e9e80e1df859

adoroszlai · 2024-06-08T06:52:28Z

Thanks @swamirishi for updating the patch. Can you please check test failures, they seem to be related?

Change-Id: I822685eb14a347cd4dfa256a5c14d58ccad6b12e

Change-Id: If2ca384a0e026617fa8893c5dd306c3470971931

Change-Id: I9744461813f5b6729039becb0d02d3934a8908c4

Change-Id: Ib28d150af433ab1f298d7892cd3e243c7b509522

Change-Id: Id81b80a6c4d6137ac287022cd9386ad29ca70194

Change-Id: I2c01971613bf33a26507d2f6b464feb71bfe5d6e # Conflicts: # hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/container/TestContainerReplication.java

Change-Id: Icb4405b0f899b0a83babd1ea49221f4ffdedf2cd

swamirishi · 2024-06-11T23:54:57Z

@adoroszlai @sodonnel Can you take another look on the patch? I have fixed the acceptance test failures. I will run the CI once I get green ci run on my fork. For the getBlock request I am passing the replicaIndexMap since getBlock command is also sent to get the EC checksum of the blocks which means, even though each of the DN would have different replica index it would still have the same checksum info. The nodes in the pipeline would have different replica indexes.
We also cannot use the pipeline from the xceiverClient object since this object comes from a cache

ozone/hadoop-hdds/client/src/main/java/org/apache/hadoop/hdds/scm/XceiverClientManager.java

Line 237 in 8db644c

return clientCache.get(key, new Callable<XceiverClientSpi>() {

.
xceiverClient.getPipeline() & pipelineRef.get() may have different replicaIndexes. This is the reason why the acceptance test was failing.

Change-Id: I9fda20fc747f5989cc571ce7d088b5b449a2734f

adoroszlai

Thanks @swamirishi for continuing work on this.

hadoop-hdds/client/src/main/java/org/apache/hadoop/hdds/scm/storage/BlockInputStream.java

adoroszlai · 2024-06-13T12:44:01Z

hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/scm/storage/ContainerProtocolCalls.java

    String traceId = TracingUtil.exportCurrentSpan();
    if (traceId != null) {
      builder.setTraceID(traceId);
    }
+    final DatanodeBlockID.Builder datanodeBlockID = blockID.getDatanodeBlockIDProtobufBuilder();
+    int replicaIndex = replicaIndexes.getOrDefault(datanode, 0);


Can we get the replica index from the pipeline here, instead of requiring a map to be passed?

Suggested change

int replicaIndex = replicaIndexes.getOrDefault(datanode, 0);

int replicaIndex = xceiverClient.getPipeline().getReplicaIndex(datanode);

No we cannot use this. My previous commit had the same thing. XceiverClient maintains a cache for the pipeline. 2 pipelines are same if the list of nodes are the same.

ozone/hadoop-hdds/client/src/main/java/org/apache/hadoop/hdds/scm/XceiverClientManager.java

Line 237 in 8db644c

return clientCache.get(key, new Callable<XceiverClientSpi>() {

So 2 GetBlock requests to the same pipeline can interfere with one another. This was the reason why acceptance test was failing previously.

I see, so we need to use the refreshed pipeline because indexes may be different. In that case, I think we can still pass the pipeline instead of the replicaIndexes map.

Wouldn't this create a confusion between xceiverClient.getPipeline() vs the pipeline object we are sending. That is why I thought it would be just better to send the replicaIndexMap instead.

2 containers having the same nodes but different replica index ordering

This should not be possible right? Placement policy should not allow this to place 2 different indexes in same node.Do we have a scenario?

Does this impact performance? for every getBlock we refreshPipeline ? Even though client has old index and if it is able to read same index block, then things should be fine right?

I had a chat with @swamirishi offline. Current pipeline ids created per ECBlockOutStream is based on dn uuid. There is a chance two distinct containers can be part os same node and pipeline cache can get connection correctly but client also holds pipeline object, which can be pointing to other container rep index as part of dn->replIndex map.

@swamirishi please provide the description what we discussed here for better understanding. Thanks

The standalone pipeline getting initialized sets the pipeline id as the Datanode UUID

ozone/hadoop-hdds/client/src/main/java/org/apache/hadoop/ozone/client/io/ECBlockInputStream.java

Line 179 in c4dc6a0

.setId(PipelineID.valueOf(dataLocation.getUuid())).setReplicaIndexes(

. Now when a client reads a key, it could be reading the key from multiple blocks belonging to different containers.
Consider the case:

DN1-uuid = 1 = Standalone PipelineId Container 1(Block1), | Container 2(Block2) DN1 => R3 | DN1 => R2

The XceiverClientManager maitains a cache based on the pipeline id. So for a standalone pipeline it would be nothing but the DN uuid itself. If the client for the DN has been initialized before then the same client would be reused. But the replicaIndexes for the pipelines are different. Now that we have added a validation on the replication index on the getBlock request. The get Block2 call to getting DN1 will fail if we depend on the pipeline object inside the xceiverClient object, since the pipeline object expects a replicaIndex 3 in DN1 but the actual block2 data present in the datanode is for replica index 2. So the server throws an exception because of this.

The same could also occur when a refresh pipeline as well, wherein the DNs have changed the replica index, the client has updated the pipeline object but the cached XceiverClient has not updated the replicaIndexes. This is the same case @adoroszlai previously mentioned in the thread.

Thanks Swami for the explanation.

adoroszlai · 2024-06-13T12:46:02Z

hadoop-hdds/client/src/main/java/org/apache/hadoop/hdds/scm/storage/BlockDataStreamOutput.java

@@ -204,7 +205,7 @@ private DataStreamOutput setupStream(Pipeline pipeline) throws IOException {
    //  it or remove it completely if possible
    String id = pipeline.getFirstNode().getUuidString();
    ContainerProtos.ContainerCommandRequestProto.Builder builder =
-        ContainerProtos.ContainerCommandRequestProto.newBuilder()
+        getContainerCommandRequestProtoBuilder()


The patch is huge (122K), hard to review. Size could be reduced by prefactoring:

introduce this factory method

replace all calls of newBuilder()

in a separate patch, without any functional changes, before the fix.

ok will raise a patch for this.

@adoroszlai @sodonnel I have raised another patch since this patch is becomning too big to review #6812

The other PR ,merged now.

...tegration-test/src/test/java/org/apache/hadoop/ozone/container/TestContainerReplication.java

Change-Id: I371912e557e4770b0bbd0f25e823a186224b79c5

Change-Id: I7868729f0910168e6ed4610943255c79b1b00012

Change-Id: I43551feebf091706bd9ab2bed5d0e671cf3d4077

Change-Id: Id88ef2be0dbbfa1502642f147e9004ffaaa83722

Change-Id: Ib736cce10e071142f9f9c060cc1cbb7e10d700a7

swamirishi · 2024-06-17T21:43:49Z

@adoroszlai @sodonnel Can you review this patch? Now that #6812 is merged.

adoroszlai

Thanks @swamirishi for updating the patch.

adoroszlai · 2024-06-19T16:35:40Z

hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/ClientVersion.java

@@ -42,6 +42,9 @@ public enum ClientVersion implements ComponentVersion {
      "This client version has support for Object Store and File " +
          "System Optimized Bucket Layouts."),

+  ERASURE_CODING_READ_CHUNK_CORRUPTION_FIX(4,


Thanks for explaining why the new version is required.

READ_CHUNK_CORRUPTION_FIX is too generic.

Let's rename the constant to something like EC_REPLICA_INDEX_REQUIRED_IN_BLOCK_REQUEST. This describes the expected client behavior.

Please also update the description text.

adoroszlai · 2024-06-19T16:41:11Z

...tainer-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueHandler.java

    ChunkBuffer data;
    try {
      BlockID blockID = BlockID.getFromProtobuf(
          request.getReadChunk().getBlockID());
      ChunkInfo chunkInfo = ChunkInfo.getFromProtoBuf(request.getReadChunk()
          .getChunkData());
      Preconditions.checkNotNull(chunkInfo);
-
+      if (request.hasVersion() && request.getVersion() >= ERASURE_CODING_READ_CHUNK_CORRUPTION_FIX.toProtoValue()) {


Please create a helper function for this condition (and reuse in handleGetBlock).

adoroszlai · 2024-06-19T16:54:26Z

...ervice/src/main/java/org/apache/hadoop/ozone/container/keyvalue/interfaces/BlockManager.java

+  BlockData getBlock(Container container, BlockID blockID, boolean isReplicaCheckRequired)
      throws IOException;


I think the caller should verify replicaIndex as needed, instead of overloading getBlock. Verification requires only container and blockID, both inputs of the function.

I thought since the function verifys BcsID, it would be logical to verifyReplicaIndex as well. Since like how BCSId is for Ratis. ReplicaIndex is for EC.

I see your point and having those two verifications in the same place has some merit. But bcsID is always verified, while replicaIndex verification is conditional on yet another parameter. So it is entirely the responsibility of the caller. I think this, and the need to increase the size of the BlockManager interface makes having the check inside getBlock worse than having it outside.

Unfortunately in today's date this is conditional because of the need to support the older clients. We can eventually remove this check when we deprecate older clients and always perform the replica index check

This would atleast make the caller aware that there is a replicaIndex check that should happen for EC and would not lead to future regressions. Anyhow there is a default function they need not always make a call to the 3 parameter function

@adoroszlai I have made the change and moved the check outside and removed the boolean parameter

If we would like to make verifyReplicaIdx and verifyBCSId calls consistent (i.e. happen in the same places), we should move verifyBCSId out of getBlock.

Current call hierarchy:

verifyBCSId > BlockManagerImpl.getBlock > BlockManagerImpl.getBlock > FilePerChunkStrategy.readChunk > KeyValueHandler.handleGetSmallFile > KeyValueHandler.handleReadChunk > KeyValueHandler.handleGetSmallFile > KeyValueHandler.handleGetBlock > KeyValueHandler.handleGetCommittedBlockLength > KeyValueHandler.handleReadChunk

So all callers originate from 4 methods of KeyValueHandler. I think that's where verification should happen. Some of them already do, as shown above.

Verifying either bcsID or replicaIndex multiple times for the same handleReadChunk operation is unnecessary.

handleGetSmallFile calls the old getBlock, but I think it should also verifyReplicaIndex.

So the two verify... calls only need to be added in handleGetBlock and handleGetSmallFile.

Can handleGetSmall have EC replication?

Since there is no EC replication there we are going to be fetching the data as it is we don't need the verify replica index for Small files

We can take up this refactoring task item as part of another patch.

Change-Id: I80025cd3e99c3cf44887e85871261467323d29b9

Change-Id: If7339e4092797573a63d861092fe053b3653454e

umamaheswararao · 2024-06-20T06:11:14Z

...ner-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/helpers/BlockUtils.java

+   */
+  public static void verifyReplicaIdx(Container container, BlockID blockID)
+      throws IOException {
+    Integer containerReplicaIndex = container.getContainerData().getReplicaIndex();


Can we verify replicaIndex only when container is EC?
If this check is happening with any container, can you think what happens when containers upgraded from old version where there are no replica index persisted? I think we are not persisting replica index for Ratis containers, so replica index will be null for them? Can you verify auto conversion are safe from Integer to primitive?

We could avoid conversion, null values, etc. with a dedicated value (e.g. -1) for "not set". I didn't want to nitpick on that. :)

Yeah, currently it is assigned replicaIndex to Integer object, so it will attempt to do auto conversion for if check validation in next line. so I am worried whether we are taking care of that, otherwise the code like the below can hit NPE. ( ex: Integer i = null; if(i>0) hits NPE)

Container uses primitive for replicaIndex, so NPE is not a concern here.

ozone/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/impl/ContainerData.java

Lines 202 to 203 in 59560a1

public int getReplicaIndex() {

return replicaIndex;

However, it is unnecessarily boxed, only to be able to use equals later.

container.getContainerData().getReplicaIndex() returns a primitive int. So there is no chance of NPE. I am converting primitive to non primitive Integer. I wanted to use .equals method so that I don't have to do the null check for blockID.getReplicaIndex() which can return a null value.

Got it, thanks for the details.

swamirishi · 2024-06-20T15:39:55Z

Thanks for reviews on the patch @adoroszlai @sodonnel @umamaheswararao

…r in DN mismatches (apache#6779) (cherry picked from commit 769d09e)

swamirishi added 3 commits June 5, 2024 20:40

HDDS-10983. EC Key read corruption when the replica index of containe…

a1d5c4a

…r in DN mismatches Change-Id: Ic88575f31305bde9d78b0e4d0c7bbf25c53a7ccb

HDDS-10983. Add Testcase

8815940

Change-Id: I4386a0de5d61d1cec51f63ed7f75d531fa389e9c

HDDS-10983. Make the read api compatible with older clients

4287270

Change-Id: Ic1175d69441957382fb9213f7e37e3047d284806

sodonnel reviewed Jun 7, 2024

View reviewed changes

hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/client/BlockID.java Outdated Show resolved Hide resolved

sodonnel reviewed Jun 7, 2024

View reviewed changes

hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/client/BlockID.java Outdated Show resolved Hide resolved

sodonnel reviewed Jun 7, 2024

View reviewed changes

hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/client/BlockID.java Outdated Show resolved Hide resolved

adoroszlai self-requested a review June 7, 2024 09:01

sodonnel reviewed Jun 7, 2024

View reviewed changes

swamirishi added 2 commits June 7, 2024 07:48

HDDS-10983. Remove stacktrace

2bb1b9d

Change-Id: Ifd60cb93ccb5e6ff7c81995734459719f6d5ecc3

HDDS-10983. Fix checkstyle

5886cf4

Change-Id: I8f6e02e917dbc4d99294ca3c6d61109b1dafb897

swamirishi requested a review from sodonnel June 7, 2024 15:13

sodonnel approved these changes Jun 7, 2024

View reviewed changes

adoroszlai reviewed Jun 7, 2024

View reviewed changes

swamirishi added 3 commits June 7, 2024 11:41

HDDS-10983. Address review comments

cfc7415

Change-Id: If2af2959ab79ee0ffd275c21909c9dc86e1c2c38

HDDS-10983. Fix testcase

8edc907

Change-Id: I44de245342a7622d1c3fd4b49dd1db4c2bcac5b4

HDDS-10983. Add client version test

1476d59

Change-Id: I9a8905dad8f4db731b110644fa03e9e80e1df859

adoroszlai marked this pull request as draft June 8, 2024 17:14

swamirishi added 5 commits June 8, 2024 17:16

HDDS-10983. Fix test cases

437594d

Change-Id: I822685eb14a347cd4dfa256a5c14d58ccad6b12e

HDDS-10983. Fix issues

65994f3

Change-Id: If2ca384a0e026617fa8893c5dd306c3470971931

HDDS-10983. Fix testcases

731107c

Change-Id: I9744461813f5b6729039becb0d02d3934a8908c4

HDDS-10983. Fix checkstyle

5857567

Change-Id: Ib28d150af433ab1f298d7892cd3e243c7b509522

HDDS-10983. Fix Acceptance test

dd11beb

Change-Id: Id81b80a6c4d6137ac287022cd9386ad29ca70194

swamirishi force-pushed the HDDS-10983 branch from 0286888 to dd11beb Compare June 11, 2024 23:40

swamirishi added 2 commits June 11, 2024 16:44

Merge remote-tracking branch 'apache/master' into HEAD

4c4e2c7

Change-Id: I2c01971613bf33a26507d2f6b464feb71bfe5d6e # Conflicts: # hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/container/TestContainerReplication.java

HDDS-10983. Fix checkstyle

0d769da

Change-Id: Icb4405b0f899b0a83babd1ea49221f4ffdedf2cd

swamirishi requested review from sodonnel and adoroszlai June 11, 2024 23:57

swamirishi marked this pull request as ready for review June 12, 2024 01:34

HDDS-10983. change pipeline supplier usage

45279a3

Change-Id: I9fda20fc747f5989cc571ce7d088b5b449a2734f

adoroszlai reviewed Jun 13, 2024

View reviewed changes

swamirishi added 2 commits June 13, 2024 17:49

HDDS-10983. Address review comments and simplify testcase

58b089d

Change-Id: I371912e557e4770b0bbd0f25e823a186224b79c5

HDDS-10983. Convert to mapToInt

e4c0d67

Change-Id: I7868729f0910168e6ed4610943255c79b1b00012

swamirishi requested a review from adoroszlai June 14, 2024 00:53

swamirishi mentioned this pull request Jun 14, 2024

HDDS-11013. Ensure version is always set in ContainerCommandRequestProto #6812

Merged

swamirishi added 3 commits June 17, 2024 14:31

HDDS-11013. Merge upstream master

da98011

Change-Id: I43551feebf091706bd9ab2bed5d0e671cf3d4077

Merge remote-tracking branch 'apache/master' into HEAD

5db065f

Change-Id: Id88ef2be0dbbfa1502642f147e9004ffaaa83722

HDDS-10983. Merge master

b0764e4

Change-Id: Ib736cce10e071142f9f9c060cc1cbb7e10d700a7

adoroszlai reviewed Jun 19, 2024

View reviewed changes

swamirishi added 2 commits June 19, 2024 11:12

HDDS-10983. Address reveiw comments

3ae6768

Change-Id: I80025cd3e99c3cf44887e85871261467323d29b9

HDDS-10983. Move replica index validation outside block manager

040d6d8

Change-Id: If7339e4092797573a63d861092fe053b3653454e

adoroszlai approved these changes Jun 20, 2024

View reviewed changes

umamaheswararao reviewed Jun 20, 2024

View reviewed changes

umamaheswararao approved these changes Jun 20, 2024

View reviewed changes

swamirishi merged commit 769d09e into apache:master Jun 20, 2024
39 checks passed

xichen01 pushed a commit to xichen01/ozone that referenced this pull request Jul 17, 2024

HDDS-10983. EC Key read corruption when the replica index of containe…

ffc5677

…r in DN mismatches (apache#6779) (cherry picked from commit 769d09e)

xichen01 pushed a commit to xichen01/ozone that referenced this pull request Jul 17, 2024

HDDS-10983. EC Key read corruption when the replica index of containe…

4bc0262

…r in DN mismatches (apache#6779) (cherry picked from commit 769d09e)

xichen01 pushed a commit to xichen01/ozone that referenced this pull request Jul 17, 2024

HDDS-10983. EC Key read corruption when the replica index of containe…

7a1bc92

…r in DN mismatches (apache#6779) (cherry picked from commit 769d09e)

xichen01 pushed a commit to xichen01/ozone that referenced this pull request Jul 18, 2024

HDDS-10983. EC Key read corruption when the replica index of containe…

0fcf2bc

…r in DN mismatches (apache#6779) (cherry picked from commit 769d09e)

xichen01 pushed a commit to xichen01/ozone that referenced this pull request Jul 18, 2024

HDDS-10983. EC Key read corruption when the replica index of containe…

c1158b4

…r in DN mismatches (apache#6779) (cherry picked from commit 769d09e)

xichen01 pushed a commit to xichen01/ozone that referenced this pull request Jul 18, 2024

HDDS-10983. EC Key read corruption when the replica index of containe…

91567d9

…r in DN mismatches (apache#6779) (cherry picked from commit 769d09e)

xichen01 mentioned this pull request Jul 18, 2024

[DO NOT MERGE] Backport some fixes, performance optimizations from master to ozone-1.4 #6929 #6964

Merged

	int replicaIndex = replicaIndexes.getOrDefault(datanode, 0);
	int replicaIndex = xceiverClient.getPipeline().getReplicaIndex(datanode);

		BlockData getBlock(Container container, BlockID blockID, boolean isReplicaCheckRequired)
		throws IOException;

HDDS-10983. EC Key read corruption when the replica index of container in DN mismatches #6779

HDDS-10983. EC Key read corruption when the replica index of container in DN mismatches #6779

Conversation

swamirishi commented Jun 6, 2024 • edited Loading

What changes were proposed in this pull request?

What is the link to the Apache JIRA

How was this patch tested?

Choose a reason for hiding this comment

Choose a reason for hiding this comment

swamirishi Jun 7, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

sodonnel left a comment

Choose a reason for hiding this comment

sodonnel left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

swamirishi commented Jun 7, 2024 • edited Loading

adoroszlai commented Jun 7, 2024

swamirishi commented Jun 7, 2024 • edited Loading

adoroszlai commented Jun 8, 2024

swamirishi commented Jun 11, 2024 • edited Loading

adoroszlai left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

swamirishi Jun 13, 2024 • edited Loading

Choose a reason for hiding this comment

umamaheswararao Jun 19, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

swamirishi Jun 19, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

swamirishi commented Jun 17, 2024

adoroszlai left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

swamirishi Jun 19, 2024 • edited Loading

Choose a reason for hiding this comment

swamirishi Jun 19, 2024 • edited Loading

Choose a reason for hiding this comment

swamirishi Jun 19, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

adoroszlai Jun 20, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

swamirishi Jun 20, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

swamirishi commented Jun 20, 2024

swamirishi commented Jun 6, 2024 •

edited

Loading

swamirishi Jun 7, 2024 •

edited

Loading

swamirishi commented Jun 7, 2024 •

edited

Loading

swamirishi commented Jun 7, 2024 •

edited

Loading

swamirishi commented Jun 11, 2024 •

edited

Loading

swamirishi Jun 13, 2024 •

edited

Loading

umamaheswararao Jun 19, 2024 •

edited

Loading

swamirishi Jun 19, 2024 •

edited

Loading

swamirishi Jun 19, 2024 •

edited

Loading

swamirishi Jun 19, 2024 •

edited

Loading

swamirishi Jun 19, 2024 •

edited

Loading

adoroszlai Jun 20, 2024 •

edited

Loading

swamirishi Jun 20, 2024 •

edited

Loading