Core: Add a util to compute partition stats #11146
I like the new approach of collecting stats per manifest and then merging them. Left a few suggestions. Nice work, @ajantha-bhat!
Almost there!
LGTM, final minor/cosmetic comments.
@aokolnychyi: Thanks for the review and guidance.
Thanks for the hard work, @ajantha-bhat! Great to have this in.
Introduces two APIs to compute and sort partition stats as per the spec.
Iterable<PartitionStats> partitionStats = PartitionStatsUtil.computeStats(table, table.currentSnapshot());
PartitionStatsUtil.sortStats(partitionStats, Partitioning.partitionType(table));
Engines will use these, together with the writer, to compute and write the partition stats.
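For readers following the review thread, the "collect stats per manifest, then merge" approach mentioned above can be pictured roughly as follows. This is only a minimal sketch under stated assumptions: `FileEntry`, `Stats`, `collect`, `merge`, and `sort` are hypothetical stand-ins invented for illustration, partitions are reduced to plain strings, and none of this is the actual `PartitionStatsUtil` code from the PR.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: every type and method here is a hypothetical stand-in,
// not the Iceberg PartitionStatsUtil implementation.
final class PartitionStatsMergeSketch {

  /** One data file listed in a manifest (simplified to partition + row count). */
  static final class FileEntry {
    final String partition;
    final long recordCount;

    FileEntry(String partition, long recordCount) {
      this.partition = partition;
      this.recordCount = recordCount;
    }
  }

  /** Running totals for a single partition. */
  static final class Stats {
    final String partition;
    long recordCount;
    int fileCount;

    Stats(String partition) {
      this.partition = partition;
    }

    void add(long records, int files) {
      recordCount += records;
      fileCount += files;
    }
  }

  /** Step 1: aggregate the entries of one manifest into per-partition stats. */
  static Map<String, Stats> collect(List<FileEntry> manifest) {
    Map<String, Stats> stats = new HashMap<>();
    for (FileEntry entry : manifest) {
      stats.computeIfAbsent(entry.partition, Stats::new).add(entry.recordCount, 1);
    }
    return stats;
  }

  /** Step 2: merge the per-manifest maps into one map for the whole snapshot. */
  static Map<String, Stats> merge(List<Map<String, Stats>> perManifest) {
    Map<String, Stats> merged = new HashMap<>();
    for (Map<String, Stats> partial : perManifest) {
      for (Stats s : partial.values()) {
        merged.computeIfAbsent(s.partition, Stats::new).add(s.recordCount, s.fileCount);
      }
    }
    return merged;
  }

  /** Step 3: order the result by partition value (a plain string in this sketch). */
  static List<Stats> sort(Map<String, Stats> merged) {
    List<Stats> sorted = new ArrayList<>(merged.values());
    sorted.sort(Comparator.comparing((Stats s) -> s.partition));
    return sorted;
  }

  public static void main(String[] args) {
    List<FileEntry> manifest1 = Arrays.asList(new FileEntry("a", 10), new FileEntry("b", 5));
    List<FileEntry> manifest2 = Arrays.asList(new FileEntry("a", 7), new FileEntry("c", 3));

    Map<String, Stats> merged = merge(Arrays.asList(collect(manifest1), collect(manifest2)));
    for (Stats s : sort(merged)) {
      System.out.println(s.partition + " records=" + s.recordCount + " files=" + s.fileCount);
    }
  }
}
```

Because each manifest is aggregated independently in step 1, that step could in principle run in parallel, with only the merge in step 2 touching shared state. Whether and how the PR parallelizes this is not stated in the thread.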