support equality/positional deletes in vectorized arrow reader #11120

Open
1 of 3 tasks
callum-ryan opened this issue Sep 12, 2024 · 0 comments
Labels
improvement PR that improves existing functionality

Feature Request / Improvement

When using VectorizedTableScanIterable / ArrowReader, there is no support for deletes, whether equality or positional. Reading with deletes through IcebergGenerics "feels" far too slow, so it would be beneficial to have a vectorized implementation that does not depend on Spark and supports these kinds of reads.

It is easy enough to apply DeleteFilter by translating each ColumnarBatch into an iterable of Record (with the appropriate records removed), but I am not sure this is the most efficient way of handling deletes.
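
To make the alternative concrete, here is a minimal sketch of applying positional deletes at the batch level without materializing a Record per row. It uses plain Java only; `PositionDeleteSelection`, `select`, and its parameters are hypothetical names for illustration and are not part of the Iceberg or Arrow APIs. It assumes the deleted positions for the data file are already loaded into a sorted `long[]` and that the reader knows the file position of the first row in each batch.

```java
import java.util.Arrays;

/**
 * Hypothetical helper (not an Iceberg API): computes which rows of a
 * columnar batch survive a set of positional deletes.
 */
final class PositionDeleteSelection {

  private PositionDeleteSelection() {}

  /**
   * Returns the batch-relative indices of the rows that are NOT deleted.
   *
   * @param batchStartPos file position of the first row in the batch
   * @param numRows number of rows in the batch
   * @param sortedDeletedPositions deleted file positions, sorted ascending
   */
  static int[] select(long batchStartPos, int numRows, long[] sortedDeletedPositions) {
    int[] selected = new int[numRows];
    int count = 0;

    // Jump to the first delete that could fall inside this batch.
    int d = Arrays.binarySearch(sortedDeletedPositions, batchStartPos);
    if (d < 0) {
      d = -d - 1; // convert "not found" result to the insertion point
    }

    // Merge-style walk: both the batch rows and the deletes are ordered.
    for (int row = 0; row < numRows; row++) {
      long pos = batchStartPos + row;
      while (d < sortedDeletedPositions.length && sortedDeletedPositions[d] < pos) {
        d++;
      }
      boolean deleted =
          d < sortedDeletedPositions.length && sortedDeletedPositions[d] == pos;
      if (!deleted) {
        selected[count++] = row;
      }
    }
    return Arrays.copyOf(selected, count);
  }

  public static void main(String[] args) {
    // Batch covers file positions 100..104; positions 101 and 103 are deleted.
    int[] live = select(100L, 5, new long[] {101L, 103L, 250L});
    System.out.println(Arrays.toString(live));
  }
}
```

The resulting index array could then drive a per-column copy or a selection view over the batch, so the cost stays proportional to the batch size rather than to per-row object creation. Equality deletes would need a different approach (for example, a hash lookup on the equality columns) and are not covered by this sketch.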

Query engine

Other

Willingness to contribute

  • I can contribute this improvement/feature independently
  • I would be willing to contribute this improvement/feature with guidance from the Iceberg community
  • I cannot contribute this improvement/feature at this time
callum-ryan added the improvement label Sep 12, 2024