Predicate pushdown through exclude_columns #25117

Open
takezoe opened this issue Feb 23, 2025 · 3 comments

@takezoe
Member

takezoe commented Feb 23, 2025

Currently, predicates are not pushed down to the table scan when we use exclude_columns.

SELECT * 
FROM TABLE(exclude_columns(
  input => TABLE(tpch.sf1.customer), 
  columns => DESCRIPTOR(c_comment)
))
WHERE c_custkey = 1

Of course, we can write the query as follows:

SELECT * 
FROM TABLE(exclude_columns(
  input => TABLE(SELECT * FROM tpch.sf1.customer WHERE c_custkey = 1), 
  columns => DESCRIPTOR(c_comment)
))

However, it would be great if predicates were pushed down through exclude_columns even in the first case.

@findinpath
Contributor

@kasiafi pls triage.

@kasiafi
Member

kasiafi commented Feb 28, 2025

If we wanted to implement predicate pushdown into table functions at the engine level, we would have to consider the following:

  1. Table functions take two types of table arguments:
  • arguments with row semantics -- the function result is computed row by row
  • arguments with set semantics -- the function result is computed partition by partition
    It is legal to push a predicate into a row semantics source, but not into a set semantics source (unless the predicate is based on the partitioning columns and the source is PRUNE WHEN EMPTY).
  2. Table functions output two types of columns:
  • proper columns -- produced by the table function
  • pass-through columns -- passed directly from the input
    We can only push predicates based on the pass-through columns.
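
To make the two conditions concrete, here is a minimal sketch of the legality check in Python. It is only an illustration of the rules listed above: `Semantics`, `TableArgument` and `can_push_predicate` are hypothetical names, not Trino classes or SPI methods, and a real rule would work on plan nodes rather than sets of column names.

```python
from dataclasses import dataclass
from enum import Enum


class Semantics(Enum):
    ROW = "row"
    SET = "set"


@dataclass(frozen=True)
class TableArgument:
    # Hypothetical model of a table argument; not Trino's actual classes.
    semantics: Semantics
    partition_columns: frozenset = frozenset()
    prune_when_empty: bool = False
    pass_through_columns: frozenset = frozenset()


def can_push_predicate(arg: TableArgument, predicate_columns: frozenset) -> bool:
    """Return True if a predicate over predicate_columns may be pushed into arg's source."""
    # Output columns: the predicate may reference only pass-through columns.
    if not predicate_columns <= arg.pass_through_columns:
        return False
    # Row semantics: pushing the predicate into the source is always legal.
    if arg.semantics is Semantics.ROW:
        return True
    # Set semantics: legal only for predicates on the partitioning columns,
    # and only when the source is PRUNE WHEN EMPTY.
    return arg.prune_when_empty and predicate_columns <= arg.partition_columns
```

In this model, a row semantics argument with no pass-through columns, e.g. `TableArgument(Semantics.ROW)`, is rejected for any predicate that references a column.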

If we implemented a generic pushdown rule with the above limitations, sadly the exclude_columns function would not benefit. It has a row semantics source, but it is implemented in such a way that all of its output columns are proper, not pass-through.

To support pushdown for exclude_columns, as well as for other table functions that cannot be handled in the generic way, we would have to extend the SPI so that the connector can decide about pushdown based on the semantics of the particular function. This is possible, but it requires discussion and careful design.

@takezoe
Member Author

takezoe commented Mar 10, 2025

Ugh... Sounds like overkill for this particular purpose. I wonder if simply rewriting exclude_columns to a projection could work.
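
A rough sketch of what such a rewrite would amount to, written as a small Python string builder. This is not how the Trino optimizer works internally; `exclude_columns_as_projection` is a hypothetical helper, and rejecting the all-columns-excluded case is this sketch's own choice. It only shows that the call is equivalent to a plain projection, which the existing pushdown rules already handle.

```python
def exclude_columns_as_projection(table, input_columns, excluded, predicate=None):
    """Build a plain SELECT equivalent to exclude_columns over `table` (illustrative only)."""
    excluded_set = set(excluded)
    unknown = excluded_set - set(input_columns)
    if unknown:
        raise ValueError(f"unknown columns: {sorted(unknown)}")
    # Keep the input column order, dropping the excluded columns.
    kept = [c for c in input_columns if c not in excluded_set]
    if not kept:
        raise ValueError("all columns are excluded")
    sql = f"SELECT {', '.join(kept)} FROM {table}"
    if predicate:
        # With a plain projection the predicate sits directly above the scan.
        sql += f" WHERE {predicate}"
    return sql


# Abbreviated column list, for illustration only.
print(exclude_columns_as_projection(
    "tpch.sf1.customer",
    ["c_custkey", "c_name", "c_comment"],
    ["c_comment"],
    predicate="c_custkey = 1",
))
# prints: SELECT c_custkey, c_name FROM tpch.sf1.customer WHERE c_custkey = 1
```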
