Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Start prototyping support for opaque engine expressions #686

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

scovich
Copy link
Collaborator

@scovich scovich commented Feb 7, 2025

What changes are proposed in this pull request?

Kernel does not provide any way for engines to work with expressions kernel does not fully understand. This creates unwelcome pressure to implement operations various engines need, directly in kernel, even tho kernel itself does not use them at all. A canonical example is [NOT] IN, which is highly complex to implement fully and correctly (see e.g. #652), and which was only added in the first place because engines wanted to use it for their own query processing.

The solution is to define an opaque expression trait which engines can use to round-trip any expression they want through the kernel. Initially, kernel can support partition pruning over such expressions if the engine provides an appropriate implementation. Eventually, it might be possible to also express basic data skipping support as well -- but any such effort would be far out of scope for this initial exploration.

This PR affects the following public APIs

Adding more variants to the Expression enum, along with corresponding API support for manipulating it.

How was this change tested?

TODO!

Copy link

codecov bot commented Feb 7, 2025

Codecov Report

Attention: Patch coverage is 8.83978% with 165 lines in your changes missing coverage. Please review.

Project coverage is 83.37%. Comparing base (eedfd47) to head (fccd3b7).

Files with missing lines Patch % Lines
kernel/src/engine/arrow_expression.rs 0.00% 53 Missing ⚠️
kernel/src/expressions/mod.rs 28.07% 39 Missing and 2 partials ⚠️
kernel/src/predicates/mod.rs 0.00% 33 Missing ⚠️
ffi/src/expressions/kernel.rs 0.00% 17 Missing ⚠️
kernel/src/predicates/parquet_stats_skipping.rs 0.00% 7 Missing ⚠️
kernel/src/scan/data_skipping.rs 0.00% 7 Missing ⚠️
kernel/src/lib.rs 0.00% 5 Missing ⚠️
kernel/src/engine/parquet_row_group_skipping.rs 0.00% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #686      +/-   ##
==========================================
- Coverage   84.09%   83.37%   -0.73%     
==========================================
  Files          77       77              
  Lines       17805    17962     +157     
  Branches    17805    17962     +157     
==========================================
+ Hits        14973    14975       +2     
- Misses       2117     2271     +154     
- Partials      715      716       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@github-actions github-actions bot added the breaking-change Change that will require a version bump label Feb 7, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking-change Change that will require a version bump
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant