feat: support pushdown alias on dynamic filter with ProjectionExec
#19404
|
@jackkleeman @adriangb Hi, I added the projection alias support in #17246. Since you have the most context on this, could you please take a look when you have a chance? |
ee4e327 to
4775fc7
Compare
4775fc7 to
0ccefc8
Compare
adriangb
left a comment
Looking good! Just needs some tweaks and more tests
|
Added tests in |
8544e36 to
6c9e95b
Compare
2b6a9a5 to
b50b1ab
Compare
glob = { workspace = true }
insta = { workspace = true }
paste = { workspace = true }
pretty_assertions = "1.0"
This appears to already be used elsewhere (this is not a net new dependency), so I think it is OK to add.
I have some doubts about leaving it unchanged, say in this case: I guess my point is that replacing a not-found column with |
I just don't think this will happen. If a plan is introducing an |
01c783c to
8bc0bd4
Compare
Signed-off-by: discord9 <discord9@163.com>
Signed-off-by: discord9 <discord9@163.com>
…test: filter pushdown projection Signed-off-by: discord9 <discord9@163.com>
…iter&test: unit test Signed-off-by: discord9 <discord9@163.com>
…t assertions Signed-off-by: discord9 <discord9@163.com>
Signed-off-by: discord9 <discord9@163.com>
Signed-off-by: discord9 <discord9@163.com>
… for clarity
test: add test for filter pushdown with swapped aliases
test: update dynamic filter projection pushdown test name for consistency
Signed-off-by: discord9 <discord9@163.com>
354f328 to
d3a9259
Compare
|
@adriangb done, now |
Signed-off-by: discord9 <discord9@163.com>
@@ -0,0 +1,383 @@
// Licensed to the Apache Software Foundation (ASF) under one
Sorry to be annoying - one last request - could we rename this to column_rewriter.rs to avoid a util.rs module that as of today only holds one thing?
I would push to your repo / open a PR but I think your repo settings don't allow it.
Here's the diff:
diff --git a/datafusion/physical-plan/src/util.rs b/datafusion/physical-plan/src/column_rewriter.rs
similarity index 100%
rename from datafusion/physical-plan/src/util.rs
rename to datafusion/physical-plan/src/column_rewriter.rs
diff --git a/datafusion/physical-plan/src/lib.rs b/datafusion/physical-plan/src/lib.rs
index 79c0e9ef5..9352a143c 100644
--- a/datafusion/physical-plan/src/lib.rs
+++ b/datafusion/physical-plan/src/lib.rs
@@ -68,6 +68,7 @@ pub mod async_func;
 pub mod coalesce;
 pub mod coalesce_batches;
 pub mod coalesce_partitions;
+pub mod column_rewriter;
 pub mod common;
 pub mod coop;
 pub mod display;
@@ -92,7 +93,6 @@ pub mod streaming;
 pub mod tree_node;
 pub mod union;
 pub mod unnest;
-pub mod util;
 pub mod windows;
 pub mod work_table;
 pub mod udaf {
diff --git a/datafusion/physical-plan/src/projection.rs b/datafusion/physical-plan/src/projection.rs
index 1e9671900..8d4c775f8 100644
--- a/datafusion/physical-plan/src/projection.rs
+++ b/datafusion/physical-plan/src/projection.rs
@@ -26,13 +26,13 @@ use super::{
     DisplayAs, ExecutionPlanProperties, PlanProperties, RecordBatchStream,
     SendableRecordBatchStream, SortOrderPushdownResult, Statistics,
 };
+use crate::column_rewriter::PhysicalColumnRewriter;
 use crate::execution_plan::CardinalityEffect;
 use crate::filter_pushdown::{
     ChildFilterDescription, ChildPushdownResult, FilterColumnChecker, FilterDescription,
     FilterPushdownPhase, FilterPushdownPropagation, PushedDownPredicate,
 };
 use crate::joins::utils::{ColumnIndex, JoinFilter, JoinOn, JoinOnRef};
-use crate::util::PhysicalColumnRewriter;
 use crate::{DisplayFormatType, ExecutionPlan, PhysicalExpr};
 use std::any::Any;
 use std::collections::HashMap;
Signed-off-by: discord9 <discord9@163.com>
|
thank you @discord9 !! |
Which issue does this PR close?
Rationale for this change
For dynamic filters to work properly, the table scan must receive the correct column even when the filter passes through an alias (introduced by ProjectionExec), so the parent filters need to be rewritten in gather_filters_for_pushdown.
What changes are included in this PR?
As the title says, this adds support for handling simple aliases in pushdown filters: an aliased column in a pushdown filter is expanded to its original expression, and the filter is not pushed down if the aliased column cannot be found. With this, aliases in projections are supported. Unit tests are also added.
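To make the rewrite concrete, here is a minimal, self-contained sketch of the idea using toy types. It is not the PR's `PhysicalColumnRewriter` and does not use any DataFusion API: `Expr`, `alias_map`, and `rewrite_filter` are hypothetical names invented for illustration only. It assumes a projection can be viewed as a list of `(input expression, output alias)` pairs.

```rust
use std::collections::HashMap;

// Toy stand-in for a physical expression; NOT DataFusion's `PhysicalExpr`.
#[derive(Clone, Debug, PartialEq)]
enum Expr {
    Column(String),
    Literal(i64),
    Add(Box<Expr>, Box<Expr>),
    Gt(Box<Expr>, Box<Expr>),
}

// Hypothetical helper: maps each projection output name to the input
// expression that produces it, e.g. `a + 1 AS b` becomes "b" -> a + 1.
fn alias_map(projection: &[(Expr, &str)]) -> HashMap<String, Expr> {
    projection
        .iter()
        .map(|(expr, alias)| (alias.to_string(), expr.clone()))
        .collect()
}

// Rewrites a parent filter in terms of the projection's input columns.
// Returns `None` when the filter references a column the projection does
// not produce; in this sketch that means "do not push the filter down".
fn rewrite_filter(filter: &Expr, aliases: &HashMap<String, Expr>) -> Option<Expr> {
    match filter {
        // One lookup per column, with no second pass over the replacement,
        // so swapped aliases (`a AS b, b AS a`) are resolved correctly.
        Expr::Column(name) => aliases.get(name).cloned(),
        Expr::Literal(v) => Some(Expr::Literal(*v)),
        Expr::Add(l, r) => Some(Expr::Add(
            Box::new(rewrite_filter(l, aliases)?),
            Box::new(rewrite_filter(r, aliases)?),
        )),
        Expr::Gt(l, r) => Some(Expr::Gt(
            Box::new(rewrite_filter(l, aliases)?),
            Box::new(rewrite_filter(r, aliases)?),
        )),
    }
}
```

In this sketch, with a projection `a + 1 AS b`, a parent filter `b > 5` is rewritten to `(a + 1) > 5` before it goes to the child, while a filter on a column the projection does not output yields `None` and stays above the projection.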
AI content disclaimer: the core logic is handwritten and thoroughly understood, but the unit tests are largely generated with some human guidance.
Are these changes tested?
Unit tests and slt tests are added; please comment if more tests are needed.
Are there any user-facing changes?
Yes, dynamic filters will now work properly with aliases. I'm not sure whether that counts as a breaking change, though.