Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Using case insensitive storage type names #996

Open
kameshsampath opened this issue Feb 13, 2025 · 13 comments · May be fixed by #1022
Open

Using case insensitive storage type names #996

kameshsampath opened this issue Feb 13, 2025 · 13 comments · May be fixed by #1022
Labels
enhancement New feature or request good first issue Good for newcomers

Comments

@kameshsampath
Copy link
Contributor

Describe the bug

As developer configuring the storageTypeInfo of a Catalog, I see the storageType attribute is expected to be case sensitive e.g. s3 and S3 are treated differently. As a result my API call fails with message saying unsupported storage type.

To Reproduce

  1. Assuming you got Polaris server running
  2. Create a Catalog with
{
    "catalog": {
        "name": "my_catalog",
        "type": "INTERNAL",
        "readOnly": false,
        "properties": {
            "default-base-location": "s3://mycatalogs"
        },
        "storageConfigInfo": {
            "storageType": "s3",
            "allowedLocations": ["s3://mycatalogs"]
        }
    }
}
  1. The call will fail and checking logs you will find info saying s3 storage type is not support and supported types are ...
  2. Now update the JSON to be
{
    "catalog": {
        "name": "my_catalog",
        "type": "INTERNAL",
        "readOnly": false,
        "properties": {
            "default-base-location": "s3://mycatalogs"
        },
        "storageConfigInfo": {
            "storageType": "S3",
            "allowedLocations": ["s3://mycatalogs"]
        }
    }
}
  1. The API call will succeed.

Actual Behavior

Currently the API call succeeds only with S3 as storageType

Expected Behavior

It will be great if the system can handle the right case insensitive storage type names.

Additional context

No response

System information

OS: Linux Container
Object Storage: S3

@kameshsampath kameshsampath added the bug Something isn't working label Feb 13, 2025
@flyrain flyrain added enhancement New feature or request good first issue Good for newcomers and removed bug Something isn't working labels Feb 13, 2025
@sfc-gh-ygu
Copy link
Contributor

A quick resolution: Polaris could cast the type string to upper cases.

@MonkeyCanCode
Copy link
Contributor

Not sure if it is a good idea to make it case insensitive as all other things are case sensitive. Also, this is validated via following:

polaris.features.defaults."SUPPORTED_CATALOG_STORAGE_TYPES"=["S3","GCS","AZURE","FILE"]

I agreed the quick fix will be enforce the upper case on the catalog create payload if there is no concern. @kameshsampath what do u think?

@kameshsampath
Copy link
Contributor Author

kameshsampath commented Feb 16, 2025

thats sounds good to me @MonkeyCanCode , happy to PR if thats OK

@MonkeyCanCode
Copy link
Contributor

thats sounds good to me @MonkeyCanCode , happy to PR if thats OK

Go for it

@kameshsampath
Copy link
Contributor Author

I just went through the code to see where to patch,

Class 1: I did add case insensitive check in org.apache.polaris.service.admin.PolarisServiceImpl

  private void validateStorageConfig(
      StorageConfigInfo storageConfigInfo, RealmContext realmContext) {
    List<String> allowedStorageTypes =
        configurationStore.getConfiguration(
            realmContext, PolarisConfiguration.SUPPORTED_CATALOG_STORAGE_TYPES);
    if (allowedStorageTypes.stream().noneMatch(storageConfigInfo.getStorageType().name()::equalsIgnoreCase)) {
      LOGGER
          .atWarn()
          .addKeyValue("storageConfig", storageConfigInfo)
          .log("Disallowed storage type in catalog");
      throw new IllegalArgumentException(
          "Unsupported storage type: " + storageConfigInfo.getStorageType());
    }
  }

Class 2: StorageConfigInfo spec/polaris-management-service.yml

But then to make StorageConfigInfo enum always upperCase the value seem not possbile to patch as the class is generated form the Spec unless there is way to let the generator generate code to handle patching.

Thoughts?

@cgpoh
Copy link
Contributor

cgpoh commented Feb 18, 2025

@kameshsampath, @MonkeyCanCode , @flyrain , I apply the patch here: https://github.com/cgpoh/polaris/blob/94a9cdb50a51961cb44981dcaab4008c9de36a2f/service/common/src/main/java/org/apache/polaris/service/config/Serializers.java instead of changing the spec, wdyt?

@kameshsampath
Copy link
Contributor Author

@cgpoh
I am totally with you on not changing the spec for this as having lowercase and uppercase keys on the spec for Enum does not look good. Again that class relies on generated CreateCatalogRequestDeserializer Catalog.class where the StorageConfig is rooted.

Just note out of curiosity why that class is called Seralizers.java than DeSeralizers.java :D

@kameshsampath
Copy link
Contributor Author

ah! @cgpoh I see what you mean

@kameshsampath
Copy link
Contributor Author

@cgpoh @MonkeyCanCode @flyrain @sfc-gh-ygu how does this change look ?

  public static final class CreateCatalogRequestDeserializer
      extends JsonDeserializer<CreateCatalogRequest> {
    @Override
    public CreateCatalogRequest deserialize(JsonParser p, DeserializationContext ctxt)
        throws IOException, JacksonException {
      TreeNode treeNode = p.readValueAsTree();
      if (treeNode.isObject() && ((ObjectNode) treeNode).has("catalog")) {
        ObjectNode catalogTreeNode = (ObjectNode) treeNode;
        JsonNode catalogNode = catalogTreeNode.get("catalog");
        if (catalogNode.has("storageConfigInfo")) {
          ObjectNode storageConfigNode = (ObjectNode) catalogNode.get("storageConfigInfo");
          if (storageConfigNode.has("storageType")) {
            String storageType = storageConfigNode.get("storageType").asText();
            //make sure the storageType are Deserialized to upper case types
            storageConfigNode.put("storageType", storageType.toUpperCase());
          }
        }
        return CreateCatalogRequest.builder()
            .setCatalog(ctxt.readTreeAsValue(catalogNode, Catalog.class))
            .build();
      } else {
        return CreateCatalogRequest.builder()
            .setCatalog(ctxt.readTreeAsValue((JsonNode) treeNode, Catalog.class))
            .build();
      }
    }
  }

If this looks OK, I wonder we need the change to org.apache.polaris.service.admin.PolarisServiceImpl as storageTypes will now always be serialised to UpperCase

@kameshsampath
Copy link
Contributor Author

@cgpoh @MonkeyCanCode @flyrain @sfc-gh-ygu - please let me now if the PR looks OK. I removed the changes to the org.apache.polaris.service.admin.PolarisServiceImpl, as with this change the storageType will always be upper cased

@eric-maynard
Copy link
Contributor

I'm supportive of this change, but I think this is not the right place to do it. It should apply to all endpoints, and possibly to all enums.

@sfc-gh-ygu
Copy link
Contributor

+1 for all endpoints related to this type. We can get to other enums in separated PRs.

@kameshsampath
Copy link
Contributor Author

kameshsampath commented Feb 22, 2025 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request good first issue Good for newcomers
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants