Java test harness does not load native Velox functions (e.g. map_key_exists) #27679

@kaikalur

Problem

The Presto optimizer and planner are written in Java, but execution is increasingly moving to Velox (C++/native). Functions such as map_key_exists are built-in native Velox functions in production. In the Java test harness they are available only through SqlInvokedFunctionsPlugin, which provides a SQL-invoked fallback implementation.

When an optimizer rule generates plan nodes that reference one of these native functions (e.g. map_key_exists in PrefilterForLimitingAggregation), the Java-based tests fail unless that plugin is installed, because the function is not registered in the test server's function registry.

Current State

  • map_key_exists is a native Velox function (efficient hash probe on maps) in production
  • In the Java codebase, it exists only as a SQL-invoked function in presto-sql-invoked-functions-plugin (RETURN CONTAINS(MAP_KEYS(input), k))
  • The TpchQueryRunnerBuilder (used by most distributed query tests) does not install this plugin by default
  • As a result, any optimizer rule that references such a native Velox function fails in OSS Java tests that use a builder without the plugin
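
To make the difference between the two implementations concrete, here is a minimal plain-Java sketch. It is not Presto or Velox code: the class and method names are hypothetical, java.util.Map stands in for a SQL map value, and SQL NULL handling is not modeled. One method mirrors the fallback body CONTAINS(MAP_KEYS(input), k), the other stands in for the hash probe described above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MapKeyExistsSketch {
    // Mirrors the SQL-invoked fallback CONTAINS(MAP_KEYS(input), k):
    // materialize the keys as a list, then scan the list linearly.
    public static <K, V> boolean fallbackMapKeyExists(Map<K, V> input, K k) {
        List<K> keys = new ArrayList<>(input.keySet());
        return keys.contains(k);
    }

    // Stand-in for the native behaviour: a direct hash probe on the map.
    public static <K, V> boolean probeMapKeyExists(Map<K, V> input, K k) {
        return input.containsKey(k);
    }

    public static void main(String[] args) {
        Map<String, Integer> input = new HashMap<>();
        input.put("a", 1);
        input.put("b", 2);

        // Both variants agree on the result; only the cost differs.
        System.out.println(fallbackMapKeyExists(input, "a") == probeMapKeyExists(input, "a"));
        System.out.println(fallbackMapKeyExists(input, "z") == probeMapKeyExists(input, "z"));
    }
}
```

For non-null keys the two variants return the same answer, which is why the SQL-invoked version can serve as a functional fallback in tests even though it does more work per call.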

Impact

As more optimizer rules are written to target native Velox functions, this gap between the Java test harness and the production runtime will become a recurring issue. Each new usage requires manually adding the plugin to the test builder, which is fragile and easy to miss.

Proposed Solution

Consider one or more of:

  1. Load SqlInvokedFunctionsPlugin by default in TpchQueryRunnerBuilder and other standard test builders so all SQL-invoked function fallbacks are available
  2. Establish a registry mapping from native Velox function names to their Java fallback implementations, so the test harness can automatically resolve them
  3. Document the pattern for optimizer developers: when referencing native functions, ensure the Java fallback exists and the plugin is loaded in tests
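
Option 2 could look roughly like the sketch below. Everything in it is hypothetical: NativeFallbackRegistrySketch and its methods are not existing Presto classes, and the fallback is represented only as a SQL body string. It shows the lookup shape, not how a test harness would actually register the function.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

public class NativeFallbackRegistrySketch {
    // Hypothetical table: native function name -> SQL body of its Java-side fallback.
    private final Map<String, String> fallbackSqlByName = new HashMap<>();

    public void register(String nativeName, String fallbackSql) {
        fallbackSqlByName.put(normalize(nativeName), fallbackSql);
    }

    // An empty result means no fallback is known for this name.
    public Optional<String> resolve(String nativeName) {
        return Optional.ofNullable(fallbackSqlByName.get(normalize(nativeName)));
    }

    // Assumption of this sketch: function names are matched case-insensitively.
    private static String normalize(String name) {
        return name.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        NativeFallbackRegistrySketch registry = new NativeFallbackRegistrySketch();
        registry.register("map_key_exists", "RETURN CONTAINS(MAP_KEYS(input), k)");

        System.out.println(registry.resolve("MAP_KEY_EXISTS").isPresent());
        System.out.println(registry.resolve("some_unmapped_function").isPresent());
    }
}
```

A harness built this way could fail fast with a clear message when resolve returns empty, instead of surfacing a generic "function not registered" error deep inside a query test.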

Context

Discovered while adding map_key_exists usage in PrefilterForLimitingAggregation (PR #27678). The immediate fix was to install SqlInvokedFunctionsPlugin in TpchQueryRunnerBuilder, but the broader issue affects any optimizer rule that uses native Velox functions.
