-
Notifications
You must be signed in to change notification settings - Fork 3.9k
chore(metastore): add column reader for pointers section #19992
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
a31b118 to
159273e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a strong feeling it will look better in pkg/dataobj/sections/pointers than here (especially given iter.go already exists there). Yet I'd like to proceed with the modifications of readers for other sections first to better understand how to refactor them
| _ = cfg.TargetPageSize.Set("2MB") | ||
| _ = cfg.TargetObjectSize.Set("1GB") | ||
| _ = cfg.BufferSize.Set("16MB") | ||
| _ = cfg.TargetSectionSize.Set("128MB") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make format made this change
| _ = cfg.TargetPageSize.Set("128KB") | ||
| _ = cfg.TargetObjectSize.Set("64MB") | ||
| _ = cfg.BufferSize.Set("2MB") | ||
| _ = cfg.TargetSectionSize.Set("16MB") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make format made this change
Use columnar reader to read the stream pointers
159273e to
8190ded
Compare
| return nil | ||
| } | ||
|
|
||
| type streamSectionPointerBuilder struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you need a new type here? Could you create a buf of pointers.SectionPointer instead?
| actual, err := arrowtest.TableRows(memory.DefaultAllocator, actualTable) | ||
| require.NoError(t, err) | ||
|
|
||
| expected := arrowtest.Rows{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is correct: A range of 25-55 should cover both path2 and path3 rows. A timestamp range should match any stream it overlaps with, it doesn't have to completely cover it
What this PR does / why we need it:
Adding columnar reader for (log)pointers section.
New reader is slower than existing RowReader due to double-translation of the data (columns->rows->rows vs columns->rows->columns->rows). The idea though is to get rid of the double-translation in the future.
Read section benchmarks
RowReader vs Reader comparison benchmarks
Checklist
CONTRIBUTING.mdguide (required)featPRs are unlikely to be accepted unless a case can be made for the feature actually being a bug fix to existing behavior.docs/sources/setup/upgrade/_index.mddeprecated-config.yamlanddeleted-config.yamlfiles respectively in thetools/deprecated-config-checkerdirectory. Example PR