Member

@fsk119 fsk119 commented Oct 15, 2025

What is the purpose of the change

Add a rule to convert a Correlate node into a VectorSearchPhysicalNode. Currently, this rule requires the right subtree to be simple — that is, it must consist of either a single TableScan, or a Calc operator followed by a TableScan.
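The shape restriction described above can be sketched as a small structural check. This is only an illustration under stated assumptions: `Node`, `isSimpleRightSubtree`, and the string node kinds are hypothetical stand-ins, not Calcite or Flink classes, and the rule in this PR may implement the check differently.

```java
/** Illustrative sketch only; Node is a hypothetical stand-in, not a Calcite/Flink class. */
public class SimpleSubtreeSketch {

    /** Minimal single-input plan node. */
    static final class Node {
        final String kind;  // e.g. "TableScan", "Calc", "Join"
        final Node input;   // null for leaf nodes

        Node(String kind, Node input) {
            this.kind = kind;
            this.input = input;
        }
    }

    /** Accepts exactly two shapes: TableScan, or Calc directly on top of TableScan. */
    static boolean isSimpleRightSubtree(Node root) {
        if (root == null) {
            return false;
        }
        if (isScan(root)) {
            return true;
        }
        return "Calc".equals(root.kind) && root.input != null && isScan(root.input);
    }

    /** A scan is a leaf node of kind TableScan. */
    private static boolean isScan(Node n) {
        return "TableScan".equals(n.kind) && n.input == null;
    }

    public static void main(String[] args) {
        Node scan = new Node("TableScan", null);
        System.out.println(isSimpleRightSubtree(scan));                                     // true
        System.out.println(isSimpleRightSubtree(new Node("Calc", scan)));                   // true
        System.out.println(isSimpleRightSubtree(new Node("Calc", new Node("Calc", scan)))); // false
    }
}
```

Anything other than those two shapes is rejected, which matches the "simple right subtree" requirement stated in the description.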

Brief change log

  • Add a rule to convert a Correlate node into a VectorSearchPhysicalNode

Contributor

@lihaosky lihaosky left a comment

<Resource name="optimized rel plan">
<![CDATA[
Calc(select=[a, b, c, d, rowtime, PROCTIME_MATERIALIZE(proctime) AS proctime, e, g, PROCTIME_MATERIALIZE(proctime0) AS proctime0, score])
+- VectorSearchTableFunction(table=[default_catalog.default_database.VectorTableWithProctime], joinType=[InnerJoin], columnToSearch=[g], columnToQuery=[d], topK=[10], select=[a, b, c, d, rowtime, proctime, e, g, PROCTIME() AS proctime, score])
Contributor

proctime appears twice, once from each table. Is that a problem?

Member Author

@fsk119 fsk119 Oct 17, 2025

Nope. It only affects how the plan is displayed. I'll try to fix it anyway.

if (parentNode != null) {
throw new RelOptPlanner.CannotPlanException(
String.format(
"%s assumes calc to be the first node in parameter search_table, but it has a parent %s.",
Contributor

Can you add a test for this?

Member Author

I don't think we can add a test here, because the relational structure we currently support is too limited: only scan and calc nodes are allowed.

With only those node types, there is no way to give the calc a parent. A calc can't be the child of a scan, and a tree like calc → calc → scan can't be built either, because calc nodes are merged or eliminated during optimization (see CalcRemoveRule and CalcMergeRule).
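As a rough illustration of why a calc → calc → scan tree does not reach the rule, the toy sketch below collapses adjacent calc nodes into one. It is not how CalcMergeRule is implemented; `Node`, `mergeCalcs`, and the string expression labels are hypothetical and only mimic the effect on the tree shape.

```java
/** Toy illustration of merging adjacent Calc nodes; not Calcite's CalcMergeRule. */
public class CalcMergeSketch {

    /** Hypothetical single-input plan node. */
    static final class Node {
        final String kind;  // "Calc" or "TableScan"
        final String expr;  // label for what a Calc computes; null for scans
        final Node input;   // null for leaf nodes

        Node(String kind, String expr, Node input) {
            this.kind = kind;
            this.expr = expr;
            this.input = input;
        }
    }

    /** Rebuilds the tree bottom-up, collapsing each run of adjacent Calc nodes into one Calc. */
    static Node mergeCalcs(Node node) {
        if (node == null) {
            return null;
        }
        Node input = mergeCalcs(node.input);
        if ("Calc".equals(node.kind) && input != null && "Calc".equals(input.kind)) {
            // Compose the outer expression over the inner one and skip the inner node.
            return new Node("Calc", node.expr + "(" + input.expr + ")", input.input);
        }
        return new Node(node.kind, node.expr, input);
    }

    /** Renders the chain of node kinds from root to leaf. */
    static String render(Node node) {
        StringBuilder sb = new StringBuilder();
        for (Node n = node; n != null; n = n.input) {
            if (sb.length() > 0) {
                sb.append(" -> ");
            }
            sb.append(n.kind);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Node scan = new Node("TableScan", null, null);
        Node stacked = new Node("Calc", "outer", new Node("Calc", "inner", scan));
        System.out.println(render(stacked));             // Calc -> Calc -> TableScan
        System.out.println(render(mergeCalcs(stacked))); // Calc -> TableScan
    }
}
```

After merging, at most one calc sits above the scan, so in this sketch the calc never has a calc parent.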

@github-actions github-actions bot added the community-reviewed label (PR has been reviewed by the community.) Oct 17, 2025
Member Author

fsk119 commented Oct 17, 2025

Contributor

@lihaosky lihaosky left a comment

LGTM!

@fsk119 fsk119 merged commit 0e4e6d7 into apache:master Oct 20, 2025