Your Environment
- Presto version used: presto-native-tests branch
- Velox version used: submodule 7dcf49cee4d8988f14ee274949a0c35d9052d6ea
- Storage (HDFS/S3/GCS..): local native test data generated under presto-native-tests/target/velox_data/PARQUET
- Data source and connector used: Hive connector through Presto native tests, PARQUET
- Deployment (Cloud or On-prem): local Prestissimo debug worker, WORKER_COUNT=1, sidecarEnabled=true
- Pastebin link to the complete debug logs: N/A; focused repro output and worker log notes are included below.
Expected Behavior
All three correlated scalar subqueries listed under Steps to Reproduce should fail with the following error, because the scalar subquery can produce more than one row for at least one outer row:
Scalar sub-query has returned multiple rows
This is the behavior asserted by AbstractTestQueries.testCorrelatedNonAggregationScalarSubqueries.
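The expected semantics can be sketched with a small reference model. This is only an illustration, not Presto or Velox code: the table contents are a small subset modeled on the TPC-H nation and region tables, and the helper names (scalar_subquery, run_query, outcome) are hypothetical.

```python
# Toy reference model of a correlated scalar subquery (illustrative only).
# Assumption: REGION and NATION are a small subset modeled on TPC-H data.
REGION = [(0, "AFRICA"), (1, "AMERICA"), (2, "ASIA"), (3, "EUROPE"), (4, "MIDDLE EAST")]
NATION = [("ALGERIA", 0), ("ARGENTINA", 1), ("INDIA", 2), ("FRANCE", 3), ("EGYPT", 4)]


class MultipleRowsError(Exception):
    pass


def scalar_subquery(rows):
    # A scalar subquery must yield at most one row; zero rows means NULL.
    if len(rows) > 1:
        raise MultipleRowsError("Scalar sub-query has returned multiple rows")
    return rows[0] if rows else None


def run_query(literal, project):
    # Models: SELECT name FROM nation n
    #         WHERE literal = (SELECT project(r) FROM region r
    #                          WHERE r.regionkey > n.regionkey)
    result = []
    for name, regionkey in NATION:
        matches = [project(r) for r in REGION if r[0] > regionkey]
        value = scalar_subquery(matches)  # cardinality is checked first
        if value is not None and value == literal:
            result.append(name)
    return result


def outcome(literal, project):
    try:
        return ("succeeded", run_query(literal, project))
    except MultipleRowsError as e:
        return ("failed", str(e))


first = outcome("AFRICA", lambda r: "bleh")   # string constant projection
second = outcome("AFRICA", lambda r: r[1])    # string column projection
third = outcome(1, lambda r: 1)               # integer constant projection
```

In this model the cardinality check runs before the comparison, so all three queries fail regardless of whether the comparison would have matched.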
Current Behavior
The native engine fails only the integer-constant projection case. The two string-projection cases incorrectly complete with an empty result (first, second, and third refer to the queries in the order listed under Steps to Reproduce):
first=succeeded: MaterializedResult{rows=[], types=[varchar(25)], setSessionProperties={}, resetSessionProperties=[], clearTransactionId=false};
second=succeeded: MaterializedResult{rows=[], types=[varchar(25)], setSessionProperties={}, resetSessionProperties=[], clearTransactionId=false};
third=failed: Scalar sub-query has returned multiple rows native.default.fail(28:INTEGER, Scalar sub-query has returned multiple rows:VARCHAR) Top-level Expression: and(switch(native.default.eq(true:BOOLEAN, is_distinct), true:BOOLEAN, cast((native.default.fail(28:INTEGER, Scalar sub-query has returned multiple rows:VARCHAR)) as BOOLEAN)), native.default.eq(1:INTEGER, expr))
The two successful empty results hide the cardinality violation instead of raising the scalar-subquery error.
Possible Solution
Preliminary root-cause analysis points at the native execution of the decorrelated scalar-subquery cardinality check. The failing integer case shows the native plan evaluating an is_distinct marker guard around native.default.fail(...). The string cases appear to let the outer comparison evaluate to false and return no rows before the multiple-row guard is surfaced.
The fix should ensure the scalar-subquery cardinality guard is evaluated independently of whether the outer predicate ultimately matches. A useful starting point is the Presto-to-Velox translation and expression evaluation around MarkDistinct/is_distinct and generated native.default.fail(...) predicates for correlated scalar subqueries.
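The hypothesis above can be illustrated with a toy model of the filter and(guard, comparison). This is a sketch under stated assumptions, not a confirmed description of Velox: it assumes that a false comparison conjunct can mask an error raised by the guard conjunct. The data is a small subset modeled on TPC-H, and masking_filter, independent_filter, and the other names are hypothetical.

```python
# Toy model of the decorrelated plan's filter (illustrative only).
# Assumption: REGION and NATION are a small subset modeled on TPC-H data.
REGION = [(0, "AFRICA"), (1, "AMERICA"), (2, "ASIA"), (3, "EUROPE"), (4, "MIDDLE EAST")]
NATION = [("ALGERIA", 0), ("ARGENTINA", 1), ("INDIA", 2), ("FRANCE", 3), ("EGYPT", 4)]


class MultipleRowsError(Exception):
    pass


def joined_rows(project):
    # Join each outer row with its subquery rows and mark the first match per
    # outer row as distinct, loosely mimicking MarkDistinct/is_distinct.
    # Outer rows without a match are omitted; the comparison would drop them.
    for name, regionkey in NATION:
        seen = False
        for r in REGION:
            if r[0] > regionkey:
                yield name, (not seen), project(r)
                seen = True


def guard(is_distinct):
    # Models switch(is_distinct = true, true, fail(...)).
    if not is_distinct:
        raise MultipleRowsError("Scalar sub-query has returned multiple rows")
    return True


def masking_filter(is_distinct, comparison):
    # Hypothesized behavior: a false conjunct decides the AND, so the guard's
    # error is never surfaced for rows where the comparison is false.
    if not comparison:
        return False
    return guard(is_distinct)


def independent_filter(is_distinct, comparison):
    # Proposed behavior: the guard is evaluated regardless of the comparison.
    guard(is_distinct)
    return comparison


def run(filter_fn, literal, project):
    try:
        rows = [n for n, d, expr in joined_rows(project) if filter_fn(d, literal == expr)]
        return ("succeeded", rows)
    except MultipleRowsError as e:
        return ("failed", str(e))
```

Under the masking assumption, the two string queries return no rows because 'AFRICA' never equals the projected value, while the integer query fails because 1 = 1 lets the guard error through. That matches the results reported above, but it does not by itself confirm where the masking happens in the native engine.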
Steps to Reproduce
Run the native test repro that executes the following queries:
SELECT name
FROM nation n
WHERE 'AFRICA' = (
    SELECT 'bleh'
    FROM region
    WHERE regionkey > n.regionkey
);

SELECT name
FROM nation n
WHERE 'AFRICA' = (
    SELECT name
    FROM region
    WHERE regionkey > n.regionkey
);

SELECT name
FROM nation n
WHERE 1 = (
    SELECT 1
    FROM region
    WHERE regionkey > n.regionkey
);
Context
This was uncovered with prestodb/presto#23671. The affected native test is AbstractTestQueriesNative.testCorrelatedNonAggregationScalarSubqueries, where two string-projection multiple-row assertions had been disabled pending investigation.