Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HDDS-11244. Draft PR to address multiple issues in garbage collection. Will split it into multiple PRs #7097

Draft
wants to merge 25 commits into
base: master
Choose a base branch
from

Conversation

swamirishi
Copy link
Contributor

What changes were proposed in this pull request?

Provide a one-liner summary of the changes in the PR Title field above.
It should be in the form of HDDS-1234. Short summary of the change.

Please describe your PR in detail:

  • What changes are proposed in the PR? and Why? It would be better if it is written from third person's
    perspective not just for the reviewer.
  • Provide as much context and rationale for the pull request as possible. It could be copy-paste from
    the Jira's description if the jira is well defined.
  • If it is complex code, describe the approach used to solve the issue. If possible attach design doc,
    issue investigation, github discussion, etc.

Examples of well-written pull requests:

What is the link to the Apache JIRA

Please create an issue in ASF JIRA before opening a pull request, and you need to set the title of the pull
request which starts with the corresponding JIRA issue number. (e.g. HDDS-XXXX. Fix a typo in YYY.)

(Please replace this section with the link to the Apache JIRA)

How was this patch tested?

(Please explain how this patch was tested. Ex: unit tests, manual tests, workflow run on the fork git repo.)
(If this patch involves UI changes, please attach a screenshot; otherwise, remove this.)

…. Will split it into multiple PRs

Change-Id: I6915139a407220d782442e28be6e540342af459b
Change-Id: Ib03867d243cdbc9b6f9e36eab23fd5455aa747f1
Change-Id: Ica1ee55d7356a7ec5392b02a3bcd030ce534853b

# Conflicts:
#	hadoop-ozone/interface-client/src/main/proto/OmClientProtocol.proto
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManager.java
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OmSnapshotManager.java
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/key/OMDirectoriesPurgeRequestWithFSO.java
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/key/OMKeyPurgeRequest.java
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/snapshot/OMSnapshotMoveDeletedKeysRequest.java
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/snapshot/OMSnapshotPurgeRequest.java
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/response/snapshot/OMSnapshotMoveDeletedKeysResponse.java
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/service/AbstractKeyDeletingService.java
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/service/KeyDeletingService.java
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/service/SnapshotDeletingService.java
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/service/SnapshotDirectoryCleaningService.java
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/SnapshotUtils.java
#	hadoop-ozone/ozone-manager/src/test/java/org/apache/hadoop/ozone/om/request/key/TestOMKeyPurgeRequestAndResponse.java
Change-Id: I3cd843cd6e4075b8bac2ff78bcb35dcc6f55f9bd
omSnapshotManager, snapshotChainManager, currentSnapshotInfo, keyManager.getMetadataManager(), lock);
List<Table.KeyValue<String, String>> renamedTableEntries = keyManager.getRenamesKeyEntries(volume, bucket,
startKey, renameEntryFilter, remainNum);
submitPurgeKeysRequest()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incomplete and invalid line.

@@ -1990,12 +2000,18 @@ message SnapshotPurgeRequest {
repeated string updatedSnapshotDBKey = 2 [deprecated = true];
}

message SnapshotPurgeAndMoveRequest {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't you need to add it in Type?

@@ -1990,12 +2000,18 @@ message SnapshotPurgeRequest {
repeated string updatedSnapshotDBKey = 2 [deprecated = true];
}

message SnapshotPurgeAndMoveRequest {
required hadoop.hdds.UUID snapshotID = 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Protobuf suggests not use required from proto-3
https://protobuf.dev/programming-guides/dos-donts/#add-required

/**
* Return transaction info persisted in OM DB skipping cache.
*/
public static TransactionInfo readTransactionInfo(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need to read it from DB? It gets flushed to DB as part of the same batch transaction.

addToBatchTransactionInfoWithTrace(lastTraceId,
lastTransaction.getIndex(),
() -> omMetadataManager.getTransactionInfoTable().putWithBatch(
batchOperation, TRANSACTION_INFO_KEY, TransactionInfo.valueOf(lastTransaction)));

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, so the DB would have the correct entry right. We cannot be dependent on the cache value right. Till the point the transaction hasn't made to the rocksdb we cannot be sure if the snapshot data has made to the disk.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't it matter in this case? We don't update the cache for transactionInfoTable directly and once the transaction is flushed to DB, it gets reflected in the cache.

@@ -124,6 +126,7 @@ public static SnapshotStatus valueOf(SnapshotStatusProto status) {
private long exclusiveSize;
private long exclusiveReplicatedSize;
private boolean deepCleanedDeletedDir;
private ByteString lastTransactionInfo;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does it has to be ByteString? Is it for toString()? You can simply keep it as TransactionInfo. Rather than converting it later in OmSnapshotManager.

Copy link
Contributor Author

@swamirishi swamirishi Sep 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TransactionInfo is not a proto definition. There is codec for it which encodes & decodes to & from bytes. Proto file will not compile.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That should be fine, you just have to make sure that the Snapshotinfo proto and its codec are correct.

I feel it is an unnecessary conversion from byte[] to ByteString and to TransationInfo.

/**
* Given a volume and bucket, return the corresponding DB key prefix.
*
* @param volume - User name
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Volume Name not user name.
  2. Please either write a better Java doc or don't write it. There is no difference between this and next function's description.

/**
* Given a volume and bucket, return the corresponding DB key prefix.
*
* @param volume - User name
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Volume Name not user name.

* @param volume - User name
* @param bucket - Bucket name
*/
String getBucketKeyPrefix(String volume, String bucket);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have similar functions in SnapshotUtils. Can you please move them from here?

Copy link
Contributor Author

@swamirishi swamirishi Sep 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should be ideally having all such methods in OmMetadataManager not in SnapshotUtils. We shouldn't be having the logic of key layout in some other Util function. Our snapshotUtils has been written very specific to OmMetadataManagerImpl which is just bad code design in my opinion.

Copy link
Contributor Author

@swamirishi swamirishi Sep 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Snapshot code shouldn't be hitting metadata manager layer. All reads should be made to KeyManager & keyManager should inturn hit metadata layer. From what I understand, this was the intent of creation of these interface initially. We ended up mixing all of this logic everywhere by directly accessing tables from all the places.

Copy link
Contributor Author

@swamirishi swamirishi Sep 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should get such similar changes done on the Snapshot diff flow & SSTFiltering service as well. We can create a refactoring jira for it. Could be a good newbie jira. Would be a good opportunity for someone who wants to understand snapshot flow.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The trailing slash is mainly used by snapshots as a key prefix. Also, the comment says that it is meant to be used in snapshots and should be moved out if used out snapshots. Since this will be used at other places, it is a good time to move them to OmMetadataManager.

I'm fine if you create a Jira for it. My point was not to have duplicate functions sitting in multiple places.

@@ -112,6 +112,9 @@ public enum ResultCodes {
FAILED_TO_FIND_CONTAINER,
FAILED_TO_FIND_CONTAINER_WITH_SPACE,
BLOCK_EXISTS,
//OM can send a deleteBlockRequest for key delete more than once. Thus, it shouldn't
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove this from the change as well.

Change-Id: I019493535aaebdc779f3431dbc88d8cf0f1c0c1d
Change-Id: I235052073fb814cdc8818b37882d4a047483971d
Change-Id: I40d1c664a0dddd455ba00479457f4b7204bc9d85
Change-Id: I3df0ba8346d2573a5d403b3f58fab8aedb365bda
Change-Id: Ibaf5b526c4f821087212307d07c3847d2d15aa59
…for transactions in flight on the snapshot

Change-Id: Ifa01e118427bd37000eef801a6dc8b12ccfe5d0a
Change-Id: I02ee0ab044a234eff4ead00d1426032060edd44b
Change-Id: I414e862601ceab9dff051484c3bba72546e0bf5a
Change-Id: I55b6d5f4503d3a427ddd13c6e6ae4eac091655e0
Change-Id: I34decc69bf5b4a1c5d5a5e58865c9ede74dae2dc
Change-Id: Idbaf521397c170b381530f86f7fe9f215db5169d
Change-Id: I314f831670449f96dc804337cd30fd115ec9b49c

# Conflicts:
#	hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/om/helpers/SnapshotInfo.java
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OmSnapshotManager.java
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/key/OMKeyPurgeRequest.java
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/snapshot/OMSnapshotCreateRequest.java
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/snapshot/OMSnapshotMoveDeletedKeysRequest.java
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/SnapshotUtils.java
Change-Id: Ia5140a522ae658684068dbe3dca2f5b617d3e904
…are moved from a deleted snapshot to the next snapshot in the chain

Change-Id: Ib48b8aae241efff80fbc8636d3c5e72a49a08a30
Change-Id: Ic1aec0623f096b54280ef1675d6045bf9c517346

# Conflicts:
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/key/OMDirectoriesPurgeRequestWithFSO.java
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/key/OMKeyPurgeRequest.java
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/snapshot/OMSnapshotCreateRequest.java
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/snapshot/OMSnapshotMoveDeletedKeysRequest.java
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/snapshot/OMSnapshotPurgeRequest.java
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/service/AbstractKeyDeletingService.java
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/service/KeyDeletingService.java
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/service/SnapshotDeletingService.java
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/service/SnapshotDirectoryCleaningService.java
#	hadoop-ozone/ozone-manager/src/test/java/org/apache/hadoop/ozone/om/request/key/TestOMKeyPurgeRequestAndResponse.java
#	hadoop-ozone/ozone-manager/src/test/java/org/apache/hadoop/ozone/om/snapshot/TestSnapshotInfo.java
Change-Id: I48de75c535166c69411b9dead7a463c44cda3518

# Conflicts:
#	hadoop-ozone/interface-client/src/main/proto/OmClientProtocol.proto
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OmMetadataManagerImpl.java
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/key/OMDirectoriesPurgeRequestWithFSO.java
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/key/OMKeyPurgeRequest.java
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/snapshot/OMSnapshotMoveDeletedKeysRequest.java
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/snapshot/OMSnapshotPurgeRequest.java
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/service/AbstractKeyDeletingService.java
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/service/DirectoryDeletingService.java
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/service/KeyDeletingService.java
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/service/SnapshotDeletingService.java
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/service/SnapshotDirectoryCleaningService.java
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/SnapshotUtils.java
Change-Id: I04ce37e93e1b072a23f01b28f92a6c71f5d27d07
Change-Id: Id645dcb042055ef53094ddcc8d42e60c037cfadd
Change-Id: I0b43b2593c0de7da9a0c5d44c894aececdc7885f

# Conflicts:
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/ratis/OzoneManagerDoubleBuffer.java
Change-Id: I63fc93366d5e1d173348f8dc56ec84ab83634fce
… service

Change-Id: Ib21775038034459f3d2aeab19890ed3e94882ed0
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants