Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Improvement] Get catalogInUse and metalakeInUse from cache instead of retrieve them from storage #6566

Open
yuqi1129 opened this issue Feb 27, 2025 · 2 comments · May be fixed by #6569
Open
Assignees
Labels
improvement Improvements on everything

Comments

@yuqi1129
Copy link
Contributor

yuqi1129 commented Feb 27, 2025

What would you like to be improved?

catalogInUse and metalakeInUse is called frequently and very time-consuming as it directly load data from backend storage, which can be optimized.

Image
 protected <R, E extends Throwable> R doWithCatalog(
      NameIdentifier ident, ThrowableFunction<CatalogManager.CatalogWrapper, R> fn, Class<E> ex)
      throws E {
    checkCatalogInUse(store, ident);

    try {
      CatalogManager.CatalogWrapper c = catalogManager.loadCatalogAndWrap(ident);
      return fn.apply(c);
    } catch (Throwable throwable) {
      if (ex.isInstance(throwable)) {
        throw ex.cast(throwable);
      }
      if (RuntimeException.class.isAssignableFrom(throwable.getClass())) {
        throw (RuntimeException) throwable;
      }
      throw new RuntimeException(throwable);
    }
  }

  public static void checkCatalogInUse(EntityStore store, NameIdentifier ident)
      throws NoSuchMetalakeException, NoSuchCatalogException, CatalogNotInUseException,
          MetalakeNotInUseException {
    NameIdentifier metalakeIdent = NameIdentifier.of(ident.namespace().levels());
    checkMetalake(metalakeIdent, store);

    if (!getCatalogInUseValue(store, ident)) {
      throw new CatalogNotInUseException("Catalog %s is not in use, please enable it first", ident);
    }
  }

  private static boolean getCatalogInUseValue(EntityStore store, NameIdentifier catalogIdent) {
    try {
      CatalogEntity catalogEntity =
          store.get(catalogIdent, EntityType.CATALOG, CatalogEntity.class);
      return (boolean)
          BASIC_CATALOG_PROPERTIES_METADATA.getOrDefault(
              catalogEntity.getProperties(), PROPERTY_IN_USE);

    } catch (NoSuchEntityException e) {
      LOG.warn("Catalog {} does not exist", catalogIdent, e);
      throw new NoSuchCatalogException(CATALOG_DOES_NOT_EXIST_MSG, catalogIdent);

    } catch (IOException e) {
      LOG.error("Failed to do store operation", e);
      throw new RuntimeException(e);
    }
  }

  public static boolean metalakeInUse(EntityStore store, NameIdentifier ident)
      throws NoSuchMetalakeException {
    try {
      BaseMetalake metalake = store.get(ident, EntityType.METALAKE, BaseMetalake.class);
      return (boolean)
          metalake.propertiesMetadata().getOrDefault(metalake.properties(), PROPERTY_IN_USE);

    } catch (NoSuchEntityException e) {
      LOG.warn("Metalake {} does not exist", ident, e);
      throw new NoSuchMetalakeException(METALAKE_DOES_NOT_EXIST_MSG, ident);

    } catch (IOException e) {
      LOG.error("Failed to do store operation", e);
      throw new RuntimeException(e);
    }
  }

After this improvement, QPS of APIs like loadFileset will be increased by as much as 50%.

How should we improve?

No response

@yuqi1129 yuqi1129 added the improvement Improvements on everything label Feb 27, 2025
@Abyss-lord
Copy link
Contributor

I would like to work on it.

@yuqi1129
Copy link
Contributor Author

I would like to work on it.

Sorry, I'm already working on it. If you are interested in this point, you can take part in #6560

@yuqi1129 yuqi1129 self-assigned this Mar 3, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
improvement Improvements on everything
Projects
None yet
2 participants