-
Notifications
You must be signed in to change notification settings - Fork 9.2k
HADOOP-19766: ABFS. Force Blob Instance for FNS Accounts #8152
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
💔 -1 overall
This message was automatically generated. |
Test Results============================================================
|
| */ | ||
| private boolean isClosed = true; | ||
| private final String fileSystemId = UUID.randomUUID().toString(); | ||
| private final String DFS_DOMAIN_INDICATOR = ".dfs."; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have a constant already defined for this, we can use that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Taken
| dfsDomain); | ||
|
|
||
| // Initialize filesystem with DFS endpoint | ||
| AzureBlobFileSystem fs = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be inside try to autoclose
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
taken
| throw new InvalidConfigurationValueException(FS_AZURE_ACCOUNT_IS_HNS_ENABLED, ex); | ||
| } | ||
|
|
||
| // For FNS-DFS accounts, reset the endpoint to Blob and update the tracing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: use block comment as above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Taken
| */ | ||
| private boolean isClosed = true; | ||
| private final String fileSystemId = UUID.randomUUID().toString(); | ||
| private final String DFS_DOMAIN_INDICATOR = ".dfs."; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
| * Resets all service types to use BLOB. | ||
| * Updates the client to reflect the new default service type. | ||
| */ | ||
| public void resetEndpointforFNS() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: camel case in method name
| * | ||
| * @return the default ingress service type | ||
| */ | ||
| public AbfsServiceType getDefaultIngressServiceType() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
methond name should be getngressServiceType()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Taken
| * | ||
| * @return the default service type | ||
| */ | ||
| public AbfsServiceType getDefaultServiceType() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not used, can be removed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Taken
| * @return the configured AbfsServiceType from the URL | ||
| */ | ||
| public AbfsServiceType getFsConfiguredServiceTypeFromURL() { | ||
| return fsConfiguredServiceType; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Better to change this variable name also to fsConfiguredServiceTypeFromUrl
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
taken
| * Sets the configured service type to BLOB. | ||
| * Majorly required to correctly set user agent for FNS-Blob | ||
| */ | ||
| void setFsConfiguredServiceTypetoBlob() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: camel case in method name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better to have generic method that can accept Service Type and caller can call it with Blob
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Taken
| */ | ||
| public void resetEndpointforFNS() { | ||
| getAbfsConfiguration().setFsConfiguredServiceTypetoBlob(); | ||
| getClientHandler().setDefaultServiceType(AbfsServiceType.BLOB); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Repeated use of gets, can be initialized in a variable
| </property> | ||
| ``` | ||
|
|
||
| - ABFS allows you to implement your custom SAS Token Provider. The declared class must implement |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was this expected to be covered as part of this PR ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. We had earlier mentioned to add configs for both OAuth and delegated SAS by referring to the sections above. For SAS however, the configs remain constant since we dont have multiple implementations of it (unlike OAuth).
Would be better to mention all fixed configs ones at one place so user only needs to check configs for OAuth implementation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better readability since its less scattered now
| import static org.apache.hadoop.fs.azurebfs.constants.ConfigurationKeys.FS_AZURE_ACCOUNT_KEY_PROPERTY_NAME; | ||
| import static org.apache.hadoop.fs.azurebfs.constants.ConfigurationKeys.FS_AZURE_INGRESS_SERVICE_TYPE; | ||
| import static org.apache.hadoop.fs.azurebfs.constants.ConfigurationKeys.accountProperty; | ||
| import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.*; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- imports should be reverted back
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Taken
| import static org.apache.hadoop.fs.azurebfs.services.RetryPolicyConstants.EXPONENTIAL_RETRY_POLICY_ABBREVIATION; | ||
| import static org.apache.hadoop.test.LambdaTestUtils.intercept; | ||
| import static org.mockito.ArgumentMatchers.any; | ||
| import static org.mockito.Mockito.*; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above
| * @throws Exception if the test fails | ||
| */ | ||
| @Test | ||
| public void testFNSDfsUsesBlobInstance() throws Exception { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ingress type validations should also be added that no matter what user configures for FNS it should be BLOB and also around the changes made in AbfsOutputStream class for dfstoBlobFallback
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added
| ContextEncryptionAdapter contextEncryptionAdapter, | ||
| TracingContext tracingContext) { | ||
| int bufferSize = abfsConfiguration.getWriteBufferSize(); | ||
| getClientHandler().initServiceType(abfsConfiguration); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is it needed? During client handler init we do initialize the service types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was purely was test purpose. We had some tests that validate ingress handlers, clients by changing the service type using the configuration
Now since we use clientHandler as source of truth for these values- we were trying to reset it here. For users- nothing changes since they cant change the config value mid-workload
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rethought on this- better to remove it and change the tests to set service types through clienthandler instead before validating
This comment was marked as outdated.
This comment was marked as outdated.
|
💔 -1 overall
This message was automatically generated. |
| * Restricts the service type to BLOB for the current client handler and configuration when FNS account detected | ||
| * This method sets the ingress service type and the configured service type to {@code AbfsServiceType.BLOB}. | ||
| */ | ||
| public void restrictServiceTypeToBlob() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why 2 different methods?
Can't we combine them. Any way they are called together
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. Combined them
anujmodi2021
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
|
💔 -1 overall
This message was automatically generated. |
bhattmanish98
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 LGTM
| private String header = EMPTY_STRING; | ||
| private String ingressHandler = EMPTY_STRING; | ||
| private Boolean fnsEndpointConverted = false; | ||
| private String fnsEndptConvertedIndicator = "T"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add comment to specify reason to choose T as an indicator.
|
💔 -1 overall
This message was automatically generated. |
| private AbfsRestOperation executedRestOperation; | ||
| private String continuationToken; | ||
|
|
||
| /** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a requirement from Javadoc report?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes
|
Spotbug reported is not related to the changes done in this patch. |
bhattmanish98
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
Description of PR
JIRA: https://issues.apache.org/jira/browse/HADOOP-19766
We have observed cases of ABFS driver being initialised with DFS for FNS accounts which is not a recommended endpoint by the service.
To prevent this, we’re normalizing endpoint to Blob for FNS accounts even if they were initialized with DFS endpoint. We also shall restrict usage of a separate ingress service type for FNS accounts.
How was this patch tested?
Test suite was run. Results of which are added in the comments below