HDDS-11475. EC: Verify EC reconstruction correctness on DN #7220

Closed
wants to merge 2 commits into from

@YuanbenWang YuanbenWang commented Sep 19, 2024

What changes were proposed in this pull request?

HDFS-15759 shows a good way to guard against EC reconstruction correctness issues on the DataNode, so we could adapt it to Ozone.

To prevent further data corruption issues, this feature proposes a simple and effective way to verify
EC reconstruction correctness on the DataNode during each reconstruction task.

It verifies the correctness of outputs decoded from inputs as follows:
1. Decode one of the inputs using the outputs together with the remaining inputs;
2. Compare the decoded input with the original input.

For instance, in RS-6-3, assume that outputs [d1, p1] are decoded from inputs [d0, d2, d3, d4, d5, p0]. 
Then the verification is done by decoding d0 from [d1, d2, d3, d4, d5, p1], and comparing the original
and decoded data of d0.
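
The same decode-then-re-decode check can be sketched in a few lines. This is only an illustration: it swaps in a toy single-parity XOR code for RS-6-3, and the class and method names (`XorReconstructionCheck`, `decode`, `verify`) are hypothetical, not Ozone's or HDFS's coder API.

```java
import java.util.Arrays;

/** Toy single-parity (XOR) code standing in for RS-6-3; not a real coder API. */
final class XorReconstructionCheck {

  /**
   * XORs all given units together. With a single XOR parity, XOR-ing every
   * surviving unit of a stripe yields the one missing unit.
   */
  static byte[] decode(byte[][] units) {
    byte[] out = new byte[units[0].length];
    for (byte[] unit : units) {
      for (int i = 0; i < out.length; i++) {
        out[i] ^= unit[i];
      }
    }
    return out;
  }

  /**
   * Checks a reconstructed unit against the inputs it was decoded from:
   * replace inputs[0] with the reconstructed unit, decode inputs[0] again,
   * and compare the result with the original inputs[0].
   */
  static boolean verify(byte[][] inputs, byte[] reconstructed) {
    byte[][] swapped = inputs.clone(); // shallow copy; the input buffers are not modified
    byte[] original = swapped[0];
    swapped[0] = reconstructed;
    return Arrays.equals(original, decode(swapped));
  }
}
```

With a stripe `[d0, d1, d2, p]` where `d1` is lost, `decode({d0, d2, p})` rebuilds `d1`, and `verify({d0, d2, p}, rebuilt)` re-derives `d0` from `[rebuilt, d2, p]`. A rebuilt buffer that is inconsistent with the inputs makes the comparison fail.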

When an EC reconstruction task goes wrong, the comparison will fail with high probability.
The task then fails and is retried by the NameNode (in Ozone, SCM is the component that schedules reconstruction).
The next reconstruction will succeed if the condition that triggered the failure is gone.

This change has been running on our clusters for over 3 months.

What is the link to the Apache JIRA?

https://issues.apache.org/jira/browse/HDDS-11475

How was this patch tested?

Unit tests and online tests.

"reconstruct target containers correctly. When validation fails, " +
"reconstruction tasks will fail.",
tags = ConfigTag.CLIENT)
private boolean ecReconstructValidation = false;
Contributor


Are there any drawbacks to having this feature enabled? If we commit it as disabled it will likely never be turned on and we may get burned by a correctness issue this could have caught. If it is relatively safe we may not even want/need a config key to disable it.

Contributor Author


Thank you for reviewing! I have updated the description of this PR. The validation feeds the decoded data into a second decode and compares the result with the original data to verify that the first decode was correct. This may slow down reconstruction, so whether to enable the feature should probably be left to the user.

@sodonnel

For Ozone, we commit a "stripe checksum" to the majority of the replicas when writing each stripe. Therefore, to prove the correctness of the reconstruction you simply have to form the new stripe checksum, which should be much more efficient than an additional EC pass.

My memory of what the stripe checksum contains is a little lacking, but we added it to handle the corruption case we saw with HDFS and never implemented the validation. I.e., it is created on write, but we never use it on reconstruction.

The stripe checksum approach would be the preferred way to perform this validation on Ozone.

@sodonnel

You can see the stripe checksum is formed in ECKeyOutputStream in private StripeWriteStatus commitStripeWrite(ECChunkBuffers stripe)

It appears to be a concatenation of the checksums of all the chunks across the stripe.

Hopefully there is a way to combine the newly reconstructed chunks with the existing ones to form the same checksum.

@errose28

What is the plan of action on this change? Should we alter the approach to use stripe checksums for validation?

@sodonnel

I would suggest altering the approach to use the checksum. It should be a case of locating the correct sequence of checksums in the stripe checksum and verifying the new reconstructed checksums match. Then we know the reconstruction has created the same data.
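
A rough sketch of that check, under loudly stated assumptions: the stripe checksum is treated as a plain concatenation of one fixed-width CRC32 per chunk, ordered by chunk position in the stripe. The real layout written by `ECKeyOutputStream` may differ (checksum type, bytes per checksum, several checksums per chunk), and the class and method names below are hypothetical.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.zip.CRC32;

/** Hypothetical stripe-checksum check; the layout is an assumption, not Ozone's format. */
final class StripeChecksumCheck {

  /** Assumed width of each per-chunk checksum (one CRC32 per chunk). */
  static final int CRC_BYTES = 4;

  /** Computes the CRC32 of one chunk as 4 big-endian bytes. */
  static byte[] chunkChecksum(byte[] chunk) {
    CRC32 crc = new CRC32();
    crc.update(chunk, 0, chunk.length);
    return ByteBuffer.allocate(CRC_BYTES).putInt((int) crc.getValue()).array();
  }

  /** Builds the assumed stripe checksum: per-chunk checksums concatenated in stripe order. */
  static byte[] stripeChecksum(byte[][] chunks) {
    ByteBuffer buf = ByteBuffer.allocate(chunks.length * CRC_BYTES);
    for (byte[] chunk : chunks) {
      buf.put(chunkChecksum(chunk));
    }
    return buf.array();
  }

  /**
   * Locates the slice of the stored stripe checksum that belongs to the chunk
   * at the given zero-based position and compares it with the checksum of the
   * reconstructed chunk.
   */
  static boolean matches(byte[] storedStripeChecksum, int chunkIndex, byte[] reconstructedChunk) {
    int from = chunkIndex * CRC_BYTES;
    byte[] expected = Arrays.copyOfRange(storedStripeChecksum, from, from + CRC_BYTES);
    return Arrays.equals(expected, chunkChecksum(reconstructedChunk));
  }
}
```

The reconstruction task would read the stored stripe checksum from a surviving replica and call `matches` once per rebuilt chunk. This costs one checksum pass over the new data rather than a second decode.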

@YuanbenWang

@sodonnel Thank you for the suggestions! I agree that it is more convenient and efficient to verify EC reconstruction with the stripe checksum in Ozone. I will probably close this PR and learn more about how the checksum works.
