@mhaseeb123
Member

@mhaseeb123 mhaseeb123 commented Oct 31, 2025

Description

This PR fixes a possible out-of-bounds (OOB) memory access in the ORC, HYBRID_SCAN, and PARQUET kernels when reading an unaligned 32-bit or 64-bit value from memory.
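
For readers unfamiliar with the failure mode, here is a minimal host-side C++ sketch of it. It is not the libcudf implementation. It assumes the old pattern was the one visible in the removed diff lines (`p32[0]` and `p32[1]` combined with `__funnelshift_r`), which reads the aligned word containing the value and, when the address is misaligned, the following aligned word as well. The names `unaligned_load_memcpy` and `aligned_pair_read_end` are hypothetical, and device code would call `cuda::std::memcpy` where this sketch calls `std::memcpy`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: load a T from a possibly unaligned address by copying
// exactly sizeof(T) bytes. No byte outside [p, p + sizeof(T)) is read.
template <typename T>
T unaligned_load_memcpy(const std::uint8_t* p)
{
  assert(p != nullptr);
  T value;
  std::memcpy(&value, p, sizeof(T));
  return value;
}

// Hypothetical helper: models the "aligned pair" strategy. Given the byte
// offset of a value inside a buffer whose start is aligned to sizeof(T), it
// returns one past the last byte offset that strategy would read: the aligned
// word containing `offset`, plus the next word if `offset` is misaligned.
template <typename T>
constexpr std::size_t aligned_pair_read_end(std::size_t offset)
{
  return (offset - offset % sizeof(T)) +
         ((offset % sizeof(T)) != 0 ? 2 * sizeof(T) : sizeof(T));
}
```

For a 6-byte buffer holding a 32-bit value at offset 2, `aligned_pair_read_end<std::uint32_t>(2)` is 8, so the aligned-pair strategy would read 2 bytes past the end of the buffer, while the `memcpy`-based load reads only bytes 2 through 5.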

Checklist

  • Run benchmarks to see if there is any performance regression from the new cuda::std::memcpy based unaligned_load
  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@mhaseeb123 mhaseeb123 requested a review from a team as a code owner October 31, 2025 23:45
@mhaseeb123 mhaseeb123 requested review from bdice and lamarrr October 31, 2025 23:45
@github-actions github-actions bot added the libcudf label (Affects libcudf (C++/CUDA) code.) Oct 31, 2025
@mhaseeb123 mhaseeb123 changed the title from "Fix OOB memory access in the new PQ reader's dict decoder" to "Fix OOB memory access in ORC and the new PQ readers due to fixed-width unaligned load" Oct 31, 2025
@mhaseeb123 mhaseeb123 added the bug (Something isn't working), 3 - Ready for Review (Ready for review by team), cuIO (cuIO issue), and non-breaking (Non-breaking change) labels Oct 31, 2025
@davidwendt
Contributor

davidwendt commented Nov 3, 2025

This appears to fix the first error but it looks like there are still more memcheck errors

compute-sanitizer --tool memcheck gtests/HYBRID_SCAN_TEST --gtest_brief=1 --rmm_mode=cuda

Partial output:

======== COMPUTE-SANITIZER
========= Invalid __global__ read of size 4 bytes
=========     at cudf::io::parquet::detail::gpuStoreOutput(unsigned int *, const unsigned char *, unsigned int, unsigned int)+0x106c0 in page_data.cuh:74
=========     by thread (64,0,0) in block (19,0,0)
=========     Access at 0x7b586f5ba3cc is out of bounds
=========     and is inside the nearest allocation at 0x7b586f591e00 of size 165,326 bytes
=========         Device Frame: void cudf::io::parquet::detail::read_fixed_width_value_fast<unsigned int, cudf::io::parquet::detail::page_state_buffers_s<(int)256, (int)1, (int)1>>(cudf::io::parquet::detail::page_state_s *, T2 *, int, T1 *)+0x10570 in page_data.cuh:337
=========         Device Frame: void cudf::io::parquet::detail::<unnamed>::decode_fixed_width_values<(int)128, (bool)1, (cudf::io::parquet::detail::copy_mode)0, cudf::io::parquet::detail::page_state_buffers_s<(int)256, (int)1, (int)1>>(cudf::io::parquet::detail::page_state_s *, T4 *, int, int, int)+0x10560 in decode_fixed.cu:171
=========         Device Frame: auto void cudf::io::parquet::detail::<unnamed>::decode_page_data_generic<unsigned char, (int)128, (cudf::io::parquet::detail::decode_kernel_mask)8192>(cudf::io::parquet::detail::PageInfo *, cudf::device_span<const cudf::io::parquet::detail::ColumnChunkDesc, (unsigned long)18446744073709551615>, unsigned long, unsigned long, cudf::device_span<const bool, (unsigned long)18446744073709551615>, cudf::device_span<unsigned long, (unsigned long)18446744073709551615>, unsigned int *)::[lambda() (instance 1)]::operator ()<(cudf::io::parquet::detail::copy_mode)0>() const+0xfbc0 in decode_fixed.cu:1193
=========         Device Frame: void cudf::io::parquet::detail::<unnamed>::decode_page_data_generic<unsigned char, (int)128, (cudf::io::parquet::detail::decode_kernel_mask)8192>(cudf::io::parquet::detail::PageInfo *, cudf::device_span<const cudf::io::parquet::detail::ColumnChunkDesc, (unsigned long)18446744073709551615>, unsigned long, unsigned long, cudf::device_span<const bool, (unsigned long)18446744073709551615>, cudf::device_span<unsigned long, (unsigned long)18446744073709551615>, unsigned int *)+0xfbc0 in decode_fixed.cu:1199
=========     Saved host backtrace up to driver entry point at kernel launch time
=========         Host Frame: cuLaunchKernel [0x39d6c4] in libcuda.so.1
=========         Host Frame:  [0x141f8] in libcudart.so.12
=========         Host Frame: cudaLaunchKernel [0x7d09d] in libcudart.so.12
=========         Host Frame: cudf::io::parquet::detail::decode_page_data(cudf::detail::hostdevice_span<cudf::io::parquet::detail::PageInfo>, cudf::detail::hostdevice_span<cudf::io::parquet::detail::ColumnChunkDesc const>, unsigned long, unsigned long, int, cudf::io::parquet::detail::decode_kernel_mask, cudf::device_span<bool const, 18446744073709551615ul>, cudf::device_span<unsigned long, 18446744073709551615ul>, unsigned int*, rmm::cuda_stream_view) [0xfabda7] in libcudf.so
=========         Host Frame: cudf::io::parquet::detail::reader_impl::decode_page_data(cudf::io::parquet::detail::reader_impl::read_mode, unsigned long, unsigned long)::{lambda(cudf::io::parquet::detail::decode_kernel_mask)#1}::operator()(cudf::io::parquet::detail::decode_kernel_mask) const [0xf096b3] in libcudf.so
=========         Host Frame: cudf::io::parquet::detail::reader_impl::decode_page_data(cudf::io::parquet::detail::reader_impl::read_mode, unsigned long, unsigned long) [0xf1402f] in libcudf.so
=========         Host Frame: cudf::io::table_with_metadata cudf::io::parquet::experimental::detail::hybrid_scan_reader_impl::read_chunk_internal<cudf::column_view>(cudf::io::parquet::detail::reader_impl::read_mode, cudf::io::parquet::experimental::detail::hybrid_scan_reader_impl::read_columns_mode, cudf::column_view) [0xe85f42] in libcudf.so
=========         Host Frame: cudf::io::parquet::experimental::detail::hybrid_scan_reader_impl::materialize_payload_columns(cudf::host_span<std::vector<int, std::allocator<int> > const, 18446744073709551615ul>, std::vector<rmm::device_buffer, std::allocator<rmm::device_buffer> >&&, cudf::column_view const&, cudf::io::parquet::experimental::use_data_page_mask, cudf::io::parquet_reader_options const&, rmm::cuda_stream_view) [0xe86d16] in libcudf.so
=========         Host Frame: cudf::io::parquet::experimental::hybrid_scan_reader::materialize_payload_columns(cudf::host_span<int const, 18446744073709551615ul>, std::vector<rmm::device_buffer, std::allocator<rmm::device_buffer> >&&, cudf::column_view const&, cudf::io::parquet::experimental::use_data_page_mask, cudf::io::parquet_reader_options const&, rmm::cuda_stream_view) const [0xe67309] in libcudf.so
=========         Host Frame: hybrid_scan(std::vector<char, std::allocator<char> >&, cudf::ast::operation const&, int, std::optional<std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > > const&, rmm::cuda_stream_view, rmm::detail::cccl_async_resource_ref<cuda::mr::__4::basic_resource_ref<(cuda::mr::__4::_AllocType)1, cuda::mr::__4::device_accessible> >, rmm::mr::aligned_resource_adaptor<rmm::mr::device_memory_resource>&) [0xdd76f] in HYBRID_SCAN_TEST
=========         Host Frame: void (anonymous namespace)::test_hybrid_scan<2, 20000>(std::vector<cudf::column_view, std::allocator<cudf::column_view> > const&) [0x140006] in HYBRID_SCAN_TEST
=========         Host Frame: HybridScanTest_MaterializeLists_Test::TestBody() [0x1419a3] in HYBRID_SCAN_TEST
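
The numbers in the memcheck report above can be checked by hand: the allocation starts at 0x7b586f591e00 and is 165,326 (0x285CE) bytes long, so it ends at 0x7b586f5ba3ce, while the 4-byte read starts at 0x7b586f5ba3cc and therefore ends at 0x7b586f5ba3d0. A small sketch of that arithmetic follows; the helper name `bytes_past_end` is hypothetical and is not part of compute-sanitizer or libcudf.

```cpp
#include <cstdint>

// Hypothetical helper: number of bytes by which the access
// [access_addr, access_addr + access_size) runs past the end of the
// allocation [alloc_base, alloc_base + alloc_size). Returns 0 if the access
// does not extend beyond the end of the allocation.
constexpr std::uint64_t bytes_past_end(std::uint64_t access_addr,
                                       std::uint64_t access_size,
                                       std::uint64_t alloc_base,
                                       std::uint64_t alloc_size)
{
  return (access_addr + access_size > alloc_base + alloc_size)
           ? (access_addr + access_size) - (alloc_base + alloc_size)
           : 0;
}

// Values copied from the report above.
constexpr std::uint64_t overrun =
  bytes_past_end(0x7b586f5ba3ccULL, 4, 0x7b586f591e00ULL, 165326);
```

With these inputs `overrun` is 2: the read begins inside the allocation but its last two bytes fall outside it, which is consistent with a full-word read covering the final bytes of a page buffer.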

@mhaseeb123 mhaseeb123 moved this to Burndown in libcudf Nov 3, 2025
@mhaseeb123
Member Author

> This appears to fix the first error but it looks like there are still more memcheck errors

Fixed in ea3bf21

@mhaseeb123 mhaseeb123 added the 4 - Needs Review (Waiting for reviewer to review or respond) label and removed the 3 - Ready for Review (Ready for review by team) label Nov 3, 2025
@mhaseeb123 mhaseeb123 changed the title from "Fix OOB memory access in ORC and the new PQ readers due to fixed-width unaligned load" to "Fix OOB memory access in ORC, HYBRID_SCAN and PQ tests due to fixed-width unaligned load" Nov 3, 2025
@mhaseeb123 mhaseeb123 changed the title from "Fix OOB memory access in ORC, HYBRID_SCAN and PQ tests due to fixed-width unaligned load" to "Fix OOB memory access in Orc and Parquet stacks due to fixed-width unaligned load" Nov 3, 2025
@mhaseeb123 mhaseeb123 added the 5 - Ready to Merge (Testing and reviews complete, ready to merge) label and removed the 4 - Needs Review (Waiting for reviewer to review or respond) label Nov 3, 2025
}
template <typename T>
inline __device__ T WarpReduceOr16(T acc)
template <cudf::size_type size, typename T>
Member Author

Modernized these into a single template.

return __shfl_xor_sync(~0, var, delta);
}

inline __device__ void syncwarp() { __syncwarp(); }
Member Author

Removed because it is not used anywhere.

@mhaseeb123 mhaseeb123 changed the title from "Fix OOB memory access in Orc and Parquet stacks due to fixed-width unaligned load" to "Fix OOB memory access in Orc and Parquet stacks from fixed-width unaligned loads" Nov 3, 2025
return pos;
}

inline __device__ double Int128ToDouble_rn(uint64_t lo, int64_t hi)
Member Author

Removing an unused utility.

uint32_t v = p32[0];
return (ofs) ? __funnelshift_r(v, p32[1], ofs * 8) : v;
template <cudf::size_type size, typename T>
inline __device__ T warp_reduce_pos(T pos, uint32_t t)
Member Author

Similarly, modernized into a single template. Thanks, Cursor.

@mhaseeb123 mhaseeb123 requested a review from vuule November 4, 2025 01:29
Contributor

@vuule vuule left a comment

Looks like an all-around improvement, provided performance is not negatively impacted.

@mhaseeb123 mhaseeb123 added the DO NOT MERGE (Hold off on merging; see PR for details) label Nov 4, 2025
@copy-pr-bot

copy-pr-bot bot commented Nov 5, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@mhaseeb123 mhaseeb123 removed the DO NOT MERGE (Hold off on merging; see PR for details) label Nov 5, 2025
@mhaseeb123
Member Author

mhaseeb123 commented Nov 5, 2025

Benchmarking results

Summary

The PR results in an average regression of +0.89% across the entire Parquet reader benchmark suite.

Performance Categories

| Category | Count | Percentage | Avg % diff |
| -------- | ----- | ---------- | ---------- |
| FAST     |    15 |      4.12% |     -8.03% |
| SAME     |   265 |     72.80% |     +0.58% |
| SLOW     |    84 |     23.08% |     +3.46% |
| OVERALL  |   364 |    100.00% |     +0.89% |
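
The OVERALL row is consistent with a count-weighted mean of the three per-category averages: (15 × (-8.03) + 265 × 0.58 + 84 × 3.46) / 364 ≈ +0.89. The sketch below reproduces that arithmetic. It assumes that is how the figure was derived, which the comment does not state, and the type and function names are hypothetical.

```cpp
#include <array>
#include <cstddef>

// One row of the summary table (names are illustrative only).
struct category_summary {
  const char* status;
  int count;
  double avg_pct_diff;
};

// Values copied from the Performance Categories table.
constexpr std::array<category_summary, 3> categories{{
  {"FAST", 15, -8.03},
  {"SAME", 265, 0.58},
  {"SLOW", 84, 3.46},
}};

// Total number of benchmark cases across all categories.
constexpr int total_count(std::array<category_summary, 3> const& rows)
{
  int total = 0;
  for (std::size_t i = 0; i < rows.size(); ++i) { total += rows[i].count; }
  return total;
}

// Count-weighted mean of the per-category average percentage differences.
constexpr double overall_avg_pct_diff(std::array<category_summary, 3> const& rows)
{
  double weighted_sum = 0.0;
  for (std::size_t i = 0; i < rows.size(); ++i) {
    weighted_sum += rows[i].count * rows[i].avg_pct_diff;
  }
  return weighted_sum / total_count(rows);
}
```

The same counts also reproduce the Percentage column: 15/364, 265/364, and 84/364 round to 4.12%, 72.80%, and 23.08%.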

Hardware

RMM memory resource = pool
CUIO host memory resource = pinned_pool

* Device: `NVIDIA RTX 5880 Ada Generation`
* SM Version: 890 (PTX Version: 890)
* Number of SMs: 110
* SM Default Clock Rate: 2460 MHz
* Global Memory: 24081 MiB Free / 48632 MiB Total
* Global Memory Bus Peak: 960 GB/sec (384-bit DDR @10001MHz)
* Max Shared Memory: 100 KiB/SM, 48 KiB/Block
* L2 Cache Size: 98304 KiB
* Maximum Active Blocks: 24/SM
* Maximum Active Threads: 1536/SM, 1024/Block
* Available Registers: 65536/SM, 65536/Block
* ECC Enabled: No

Raw numbers

# Benchmark Results Comparison

## parquet_read_chunks

| data_type | io_type       | cardinality | run_length | chunk_read_limit | data_size | CPUTime_old | CPUTime_new | GPUTime_old | GPUTime_new | PCT_DIFF | STATUS |
| --------- | ------------- | ----------- | ---------- | ---------------- | --------- | ----------- | ----------- | ----------- | ----------- | -------- | ------ |
| INTEGRAL  | DEVICE_BUFFER |           0 |          1 |                0 | 536870912 |    7.774 ms |    7.828 ms |    7.770 ms |    7.824 ms |   +0.69% | SAME   |
| INTEGRAL  | DEVICE_BUFFER |        1000 |          1 |                0 | 536870912 |    7.420 ms |    7.472 ms |    7.416 ms |    7.468 ms |   +0.70% | SAME   |
| INTEGRAL  | DEVICE_BUFFER |           0 |         32 |                0 | 536870912 |    6.216 ms |    6.226 ms |    6.212 ms |    6.222 ms |   +0.16% | SAME   |
| INTEGRAL  | DEVICE_BUFFER |        1000 |         32 |                0 | 536870912 |    6.365 ms |    6.395 ms |    6.362 ms |    6.391 ms |   +0.46% | SAME   |
| INTEGRAL  | DEVICE_BUFFER |           0 |          1 |           500000 | 536870912 |  127.573 ms |  127.714 ms |  127.568 ms |  127.709 ms |   +0.11% | SAME   |
| INTEGRAL  | DEVICE_BUFFER |        1000 |          1 |           500000 | 536870912 |  123.277 ms |  123.737 ms |  123.272 ms |  123.732 ms |   +0.37% | SAME   |
| INTEGRAL  | DEVICE_BUFFER |           0 |         32 |           500000 | 536870912 |  104.938 ms |  105.404 ms |  104.934 ms |  105.400 ms |   +0.44% | SAME   |
| INTEGRAL  | DEVICE_BUFFER |        1000 |         32 |           500000 | 536870912 |  104.439 ms |  106.010 ms |  104.435 ms |  106.006 ms |   +1.50% | SAME   |
| FLOAT     | DEVICE_BUFFER |           0 |          1 |                0 | 536870912 |    4.574 ms |    4.602 ms |    4.570 ms |    4.599 ms |   +0.63% | SAME   |
| FLOAT     | DEVICE_BUFFER |        1000 |          1 |                0 | 536870912 |    5.023 ms |    5.061 ms |    5.020 ms |    5.057 ms |   +0.74% | SAME   |
| FLOAT     | DEVICE_BUFFER |           0 |         32 |                0 | 536870912 |    3.668 ms |    3.701 ms |    3.664 ms |    3.697 ms |   +0.90% | SAME   |
| FLOAT     | DEVICE_BUFFER |        1000 |         32 |                0 | 536870912 |    4.237 ms |    4.271 ms |    4.233 ms |    4.267 ms |   +0.80% | SAME   |
| FLOAT     | DEVICE_BUFFER |           0 |          1 |           500000 | 536870912 |   44.336 ms |   45.277 ms |   44.332 ms |   45.273 ms |   +2.12% | SLOW   |
| FLOAT     | DEVICE_BUFFER |        1000 |          1 |           500000 | 536870912 |   59.844 ms |   60.806 ms |   59.839 ms |   60.802 ms |   +1.61% | SAME   |
| FLOAT     | DEVICE_BUFFER |           0 |         32 |           500000 | 536870912 |   48.967 ms |   49.761 ms |   48.963 ms |   49.757 ms |   +1.62% | SAME   |
| FLOAT     | DEVICE_BUFFER |        1000 |         32 |           500000 | 536870912 |   49.163 ms |   50.078 ms |   49.159 ms |   50.074 ms |   +1.86% | SAME   |
| BOOL8     | DEVICE_BUFFER |           0 |          1 |                0 | 536870912 |   15.011 ms |   15.081 ms |   15.007 ms |   15.077 ms |   +0.47% | SAME   |
| BOOL8     | DEVICE_BUFFER |        1000 |          1 |                0 | 536870912 |   14.844 ms |   14.869 ms |   14.840 ms |   14.865 ms |   +0.17% | SAME   |
| BOOL8     | DEVICE_BUFFER |           0 |         32 |                0 | 536870912 |   13.877 ms |   13.930 ms |   13.873 ms |   13.926 ms |   +0.38% | SAME   |
| BOOL8     | DEVICE_BUFFER |        1000 |         32 |                0 | 536870912 |   13.733 ms |   13.778 ms |   13.729 ms |   13.774 ms |   +0.33% | SAME   |
| BOOL8     | DEVICE_BUFFER |           0 |          1 |           500000 | 536870912 |  585.421 ms |  589.601 ms |  585.413 ms |  589.593 ms |   +0.71% | SAME   |
| BOOL8     | DEVICE_BUFFER |        1000 |          1 |           500000 | 536870912 |  588.095 ms |  590.811 ms |  588.087 ms |  590.803 ms |   +0.46% | SAME   |
| BOOL8     | DEVICE_BUFFER |           0 |         32 |           500000 | 536870912 |  573.684 ms |  576.739 ms |  573.676 ms |  576.731 ms |   +0.53% | SAME   |
| BOOL8     | DEVICE_BUFFER |        1000 |         32 |           500000 | 536870912 |  575.612 ms |  576.936 ms |  575.603 ms |  576.928 ms |   +0.23% | SAME   |
| DECIMAL   | DEVICE_BUFFER |           0 |          1 |                0 | 536870912 |    6.345 ms |    6.391 ms |    6.341 ms |    6.387 ms |   +0.73% | SAME   |
| DECIMAL   | DEVICE_BUFFER |        1000 |          1 |                0 | 536870912 |    4.024 ms |    4.084 ms |    4.021 ms |    4.080 ms |   +1.47% | SAME   |
| DECIMAL   | DEVICE_BUFFER |           0 |         32 |                0 | 536870912 |    4.957 ms |    4.990 ms |    4.953 ms |    4.987 ms |   +0.69% | SAME   |
| DECIMAL   | DEVICE_BUFFER |        1000 |         32 |                0 | 536870912 |    3.281 ms |    3.323 ms |    3.278 ms |    3.319 ms |   +1.25% | SAME   |
| DECIMAL   | DEVICE_BUFFER |           0 |          1 |           500000 | 536870912 |   30.688 ms |   31.524 ms |   30.684 ms |   31.521 ms |   +2.73% | SLOW   |
| DECIMAL   | DEVICE_BUFFER |        1000 |          1 |           500000 | 536870912 |   36.543 ms |   37.039 ms |   36.539 ms |   37.035 ms |   +1.36% | SAME   |
| DECIMAL   | DEVICE_BUFFER |           0 |         32 |           500000 | 536870912 |   31.975 ms |   32.599 ms |   31.971 ms |   32.595 ms |   +1.95% | SAME   |
| DECIMAL   | DEVICE_BUFFER |        1000 |         32 |           500000 | 536870912 |   30.543 ms |   31.103 ms |   30.539 ms |   31.099 ms |   +1.83% | SAME   |
| TIMESTAMP | DEVICE_BUFFER |           0 |          1 |                0 | 536870912 |    7.110 ms |    7.142 ms |    7.106 ms |    7.139 ms |   +0.46% | SAME   |
| TIMESTAMP | DEVICE_BUFFER |        1000 |          1 |                0 | 536870912 |    4.634 ms |    4.660 ms |    4.630 ms |    4.657 ms |   +0.58% | SAME   |
| TIMESTAMP | DEVICE_BUFFER |           0 |         32 |                0 | 536870912 |    5.316 ms |    5.337 ms |    5.312 ms |    5.333 ms |   +0.40% | SAME   |
| TIMESTAMP | DEVICE_BUFFER |        1000 |         32 |                0 | 536870912 |    3.848 ms |    3.872 ms |    3.844 ms |    3.869 ms |   +0.65% | SAME   |
| TIMESTAMP | DEVICE_BUFFER |           0 |          1 |           500000 | 536870912 |   52.230 ms |   53.046 ms |   52.225 ms |   53.041 ms |   +1.56% | SAME   |
| TIMESTAMP | DEVICE_BUFFER |        1000 |          1 |           500000 | 536870912 |   48.373 ms |   49.072 ms |   48.369 ms |   49.067 ms |   +1.44% | SAME   |
| TIMESTAMP | DEVICE_BUFFER |           0 |         32 |           500000 | 536870912 |   41.181 ms |   42.052 ms |   41.176 ms |   42.048 ms |   +2.12% | SLOW   |
| TIMESTAMP | DEVICE_BUFFER |        1000 |         32 |           500000 | 536870912 |   39.426 ms |   40.321 ms |   39.422 ms |   40.317 ms |   +2.27% | SLOW   |
| DURATION  | DEVICE_BUFFER |           0 |          1 |                0 | 536870912 |    5.496 ms |    5.531 ms |    5.492 ms |    5.527 ms |   +0.64% | SAME   |
| DURATION  | DEVICE_BUFFER |        1000 |          1 |                0 | 536870912 |    4.656 ms |    4.685 ms |    4.652 ms |    4.681 ms |   +0.62% | SAME   |
| DURATION  | DEVICE_BUFFER |           0 |         32 |                0 | 536870912 |    3.329 ms |    3.351 ms |    3.326 ms |    3.347 ms |   +0.63% | SAME   |
| DURATION  | DEVICE_BUFFER |        1000 |         32 |                0 | 536870912 |    3.848 ms |    3.885 ms |    3.844 ms |    3.881 ms |   +0.96% | SAME   |
| DURATION  | DEVICE_BUFFER |           0 |          1 |           500000 | 536870912 |   48.602 ms |   49.388 ms |   48.598 ms |   49.384 ms |   +1.62% | SAME   |
| DURATION  | DEVICE_BUFFER |        1000 |          1 |           500000 | 536870912 |   48.323 ms |   49.163 ms |   48.319 ms |   49.159 ms |   +1.74% | SAME   |
| DURATION  | DEVICE_BUFFER |           0 |         32 |           500000 | 536870912 |   39.602 ms |   40.471 ms |   39.597 ms |   40.467 ms |   +2.20% | SLOW   |
| DURATION  | DEVICE_BUFFER |        1000 |         32 |           500000 | 536870912 |   39.791 ms |   40.651 ms |   39.786 ms |   40.647 ms |   +2.16% | SLOW   |
| STRING    | DEVICE_BUFFER |           0 |          1 |                0 | 536870912 |   16.729 ms |   16.765 ms |   16.725 ms |   16.761 ms |   +0.22% | SAME   |
| STRING    | DEVICE_BUFFER |        1000 |          1 |                0 | 536870912 |    5.643 ms |    5.721 ms |    5.639 ms |    5.717 ms |   +1.38% | SAME   |
| STRING    | DEVICE_BUFFER |           0 |         32 |                0 | 536870912 |   16.699 ms |   16.744 ms |   16.695 ms |   16.740 ms |   +0.27% | SAME   |
| STRING    | DEVICE_BUFFER |        1000 |         32 |                0 | 536870912 |    5.056 ms |    5.108 ms |    5.052 ms |    5.104 ms |   +1.03% | SAME   |
| STRING    | DEVICE_BUFFER |           0 |          1 |           500000 | 536870912 |  110.810 ms |  111.607 ms |  110.805 ms |  111.603 ms |   +0.72% | SAME   |
| STRING    | DEVICE_BUFFER |        1000 |          1 |           500000 | 536870912 |   32.434 ms |   33.296 ms |   32.430 ms |   33.292 ms |   +2.66% | SLOW   |
| STRING    | DEVICE_BUFFER |           0 |         32 |           500000 | 536870912 |  110.916 ms |  111.596 ms |  110.912 ms |  111.591 ms |   +0.61% | SAME   |
| STRING    | DEVICE_BUFFER |        1000 |         32 |           500000 | 536870912 |   31.558 ms |   32.713 ms |   31.554 ms |   32.709 ms |   +3.66% | SLOW   |
| LIST      | DEVICE_BUFFER |           0 |          1 |                0 | 536870912 |   28.380 ms |   28.695 ms |   28.376 ms |   28.691 ms |   +1.11% | SAME   |
| LIST      | DEVICE_BUFFER |        1000 |          1 |                0 | 536870912 |   28.226 ms |   28.382 ms |   28.222 ms |   28.378 ms |   +0.55% | SAME   |
| LIST      | DEVICE_BUFFER |           0 |         32 |                0 | 536870912 |   23.780 ms |   23.931 ms |   23.776 ms |   23.926 ms |   +0.63% | SAME   |
| LIST      | DEVICE_BUFFER |        1000 |         32 |                0 | 536870912 |   23.473 ms |   23.575 ms |   23.469 ms |   23.571 ms |   +0.43% | SAME   |
| LIST      | DEVICE_BUFFER |           0 |          1 |           500000 | 536870912 |  192.681 ms |  193.772 ms |  192.675 ms |  193.767 ms |   +0.57% | SAME   |
| LIST      | DEVICE_BUFFER |        1000 |          1 |           500000 | 536870912 |  314.973 ms |  316.515 ms |  314.966 ms |  316.509 ms |   +0.49% | SAME   |
| LIST      | DEVICE_BUFFER |           0 |         32 |           500000 | 536870912 |  282.365 ms |  282.480 ms |  282.359 ms |  282.474 ms |   +0.04% | SAME   |
| LIST      | DEVICE_BUFFER |        1000 |         32 |           500000 | 536870912 |  278.910 ms |  279.619 ms |  278.903 ms |  279.613 ms |   +0.25% | SAME   |
| STRUCT    | DEVICE_BUFFER |           0 |          1 |                0 | 536870912 |   19.757 ms |   20.202 ms |   19.753 ms |   20.198 ms |   +2.25% | SLOW   |
| STRUCT    | DEVICE_BUFFER |        1000 |          1 |                0 | 536870912 |   10.444 ms |   10.811 ms |   10.440 ms |   10.807 ms |   +3.52% | SLOW   |
| STRUCT    | DEVICE_BUFFER |           0 |         32 |                0 | 536870912 |   19.732 ms |   20.295 ms |   19.728 ms |   20.292 ms |   +2.86% | SLOW   |
| STRUCT    | DEVICE_BUFFER |        1000 |         32 |                0 | 536870912 |    9.761 ms |   10.155 ms |    9.757 ms |   10.151 ms |   +4.04% | SLOW   |
| STRUCT    | DEVICE_BUFFER |           0 |          1 |           500000 | 536870912 |  128.571 ms |  131.153 ms |  128.566 ms |  131.148 ms |   +2.01% | SLOW   |
| STRUCT    | DEVICE_BUFFER |        1000 |          1 |           500000 | 536870912 |   71.671 ms |   75.477 ms |   71.667 ms |   75.473 ms |   +5.31% | SLOW   |
| STRUCT    | DEVICE_BUFFER |           0 |         32 |           500000 | 536870912 |  128.755 ms |  131.386 ms |  128.750 ms |  131.383 ms |   +2.05% | SLOW   |
| STRUCT    | DEVICE_BUFFER |        1000 |         32 |           500000 | 536870912 |   71.416 ms |   74.807 ms |   71.411 ms |   74.803 ms |   +4.75% | SLOW   |

## parquet_read_column_selection

| column_selection | row_selection | str_to_categories | uses_pandas_metadata | timestamp_type | CPUTime_old | CPUTime_new | GPUTime_old | GPUTime_new | PCT_DIFF | STATUS |
| ---------------- | ------------- | ----------------- | -------------------- | -------------- | ----------- | ----------- | ----------- | ----------- | -------- | ------ |
| ALL              | ALL           | YES               | YES                  | EMPTY          |  224.437 ms |  225.003 ms |  224.432 ms |  224.996 ms |   +0.25% | SAME   |
| ALTERNATE        | ALL           | YES               | YES                  | EMPTY          |  211.776 ms |  212.190 ms |  211.771 ms |  212.184 ms |   +0.20% | SAME   |
| FIRST_HALF       | ALL           | YES               | YES                  | EMPTY          |  214.005 ms |  214.071 ms |  213.999 ms |  214.065 ms |   +0.03% | SAME   |
| SECOND_HALF      | ALL           | YES               | YES                  | EMPTY          |  210.737 ms |  210.711 ms |  210.732 ms |  210.704 ms |   -0.01% | SAME   |

## parquet_read_decode

| data_type | io_type       | cardinality | run_length | data_size | CPUTime_old | CPUTime_new | GPUTime_old | GPUTime_new | PCT_DIFF | STATUS |
| --------- | ------------- | ----------- | ---------- | --------- | ----------- | ----------- | ----------- | ----------- | -------- | ------ |
| INTEGRAL  | DEVICE_BUFFER |           0 |          1 | 536870912 |    7.502 ms |    7.585 ms |    7.498 ms |    7.582 ms |   +1.12% | SAME   |
| INTEGRAL  | DEVICE_BUFFER |        1000 |          1 | 536870912 |    7.163 ms |    7.243 ms |    7.159 ms |    7.239 ms |   +1.12% | SAME   |
| INTEGRAL  | DEVICE_BUFFER |           0 |         32 | 536870912 |    5.930 ms |    6.009 ms |    5.926 ms |    6.006 ms |   +1.35% | SAME   |
| INTEGRAL  | DEVICE_BUFFER |        1000 |         32 | 536870912 |    6.138 ms |    6.212 ms |    6.134 ms |    6.208 ms |   +1.21% | SAME   |
| FLOAT     | DEVICE_BUFFER |           0 |          1 | 536870912 |    4.441 ms |    4.461 ms |    4.437 ms |    4.457 ms |   +0.45% | SAME   |
| FLOAT     | DEVICE_BUFFER |        1000 |          1 | 536870912 |    4.863 ms |    4.915 ms |    4.859 ms |    4.911 ms |   +1.07% | SAME   |
| FLOAT     | DEVICE_BUFFER |           0 |         32 | 536870912 |    3.517 ms |    3.543 ms |    3.513 ms |    3.539 ms |   +0.74% | SAME   |
| FLOAT     | DEVICE_BUFFER |        1000 |         32 | 536870912 |    4.074 ms |    4.125 ms |    4.071 ms |    4.121 ms |   +1.23% | SAME   |
| BOOL8     | DEVICE_BUFFER |           0 |          1 | 536870912 |   14.365 ms |   14.493 ms |   14.361 ms |   14.489 ms |   +0.89% | SAME   |
| BOOL8     | DEVICE_BUFFER |        1000 |          1 | 536870912 |   14.174 ms |   14.241 ms |   14.170 ms |   14.237 ms |   +0.47% | SAME   |
| BOOL8     | DEVICE_BUFFER |           0 |         32 | 536870912 |   13.231 ms |   13.311 ms |   13.227 ms |   13.307 ms |   +0.60% | SAME   |
| BOOL8     | DEVICE_BUFFER |        1000 |         32 | 536870912 |   13.161 ms |   13.141 ms |   13.158 ms |   13.137 ms |   -0.16% | SAME   |
| DECIMAL   | DEVICE_BUFFER |           0 |          1 | 536870912 |    6.251 ms |    6.267 ms |    6.247 ms |    6.263 ms |   +0.26% | SAME   |
| DECIMAL   | DEVICE_BUFFER |        1000 |          1 | 536870912 |    3.928 ms |    3.976 ms |    3.924 ms |    3.972 ms |   +1.22% | SAME   |
| DECIMAL   | DEVICE_BUFFER |           0 |         32 | 536870912 |    4.864 ms |    4.879 ms |    4.861 ms |    4.875 ms |   +0.29% | SAME   |
| DECIMAL   | DEVICE_BUFFER |        1000 |         32 | 536870912 |    3.196 ms |    3.210 ms |    3.192 ms |    3.207 ms |   +0.47% | SAME   |
| TIMESTAMP | DEVICE_BUFFER |           0 |          1 | 536870912 |    6.962 ms |    6.983 ms |    6.958 ms |    6.979 ms |   +0.30% | SAME   |
| TIMESTAMP | DEVICE_BUFFER |        1000 |          1 | 536870912 |    4.503 ms |    4.524 ms |    4.500 ms |    4.520 ms |   +0.44% | SAME   |
| TIMESTAMP | DEVICE_BUFFER |           0 |         32 | 536870912 |    5.173 ms |    5.210 ms |    5.170 ms |    5.206 ms |   +0.70% | SAME   |
| TIMESTAMP | DEVICE_BUFFER |        1000 |         32 | 536870912 |    3.731 ms |    3.737 ms |    3.727 ms |    3.733 ms |   +0.16% | SAME   |
| DURATION  | DEVICE_BUFFER |           0 |          1 | 536870912 |    5.360 ms |    5.397 ms |    5.357 ms |    5.393 ms |   +0.67% | SAME   |
| DURATION  | DEVICE_BUFFER |        1000 |          1 | 536870912 |    4.515 ms |    4.549 ms |    4.512 ms |    4.545 ms |   +0.73% | SAME   |
| DURATION  | DEVICE_BUFFER |           0 |         32 | 536870912 |    3.209 ms |    3.218 ms |    3.206 ms |    3.214 ms |   +0.25% | SAME   |
| DURATION  | DEVICE_BUFFER |        1000 |         32 | 536870912 |    3.717 ms |    3.733 ms |    3.714 ms |    3.729 ms |   +0.40% | SAME   |
| STRING    | DEVICE_BUFFER |           0 |          1 | 536870912 |   16.516 ms |   16.518 ms |   16.512 ms |   16.514 ms |   +0.01% | SAME   |
| STRING    | DEVICE_BUFFER |        1000 |          1 | 536870912 |    5.491 ms |    5.499 ms |    5.488 ms |    5.496 ms |   +0.15% | SAME   |
| STRING    | DEVICE_BUFFER |           0 |         32 | 536870912 |   16.512 ms |   16.528 ms |   16.508 ms |   16.524 ms |   +0.10% | SAME   |
| STRING    | DEVICE_BUFFER |        1000 |         32 | 536870912 |    4.882 ms |    4.902 ms |    4.878 ms |    4.898 ms |   +0.41% | SAME   |
| LIST      | DEVICE_BUFFER |           0 |          1 | 536870912 |   26.044 ms |   26.222 ms |   26.040 ms |   26.219 ms |   +0.69% | SAME   |
| LIST      | DEVICE_BUFFER |        1000 |          1 | 536870912 |   25.296 ms |   25.375 ms |   25.292 ms |   25.371 ms |   +0.31% | SAME   |
| LIST      | DEVICE_BUFFER |           0 |         32 | 536870912 |   21.072 ms |   21.138 ms |   21.069 ms |   21.134 ms |   +0.31% | SAME   |
| LIST      | DEVICE_BUFFER |        1000 |         32 | 536870912 |   20.720 ms |   20.801 ms |   20.716 ms |   20.797 ms |   +0.39% | SAME   |
| STRUCT    | DEVICE_BUFFER |           0 |          1 | 536870912 |   19.048 ms |   19.126 ms |   19.044 ms |   19.122 ms |   +0.41% | SAME   |
| STRUCT    | DEVICE_BUFFER |        1000 |          1 | 536870912 |    9.725 ms |    9.879 ms |    9.721 ms |    9.875 ms |   +1.58% | SAME   |
| STRUCT    | DEVICE_BUFFER |           0 |         32 | 536870912 |   19.100 ms |   19.248 ms |   19.096 ms |   19.244 ms |   +0.78% | SAME   |
| STRUCT    | DEVICE_BUFFER |        1000 |         32 | 536870912 |    9.174 ms |    9.291 ms |    9.170 ms |    9.288 ms |   +1.29% | SAME   |

## parquet_read_fixed_width_struct

| data_type | io_type       | cardinality | run_length | data_size | CPUTime_old | CPUTime_new | GPUTime_old | GPUTime_new | PCT_DIFF | STATUS |
| --------- | ------------- | ----------- | ---------- | --------- | ----------- | ----------- | ----------- | ----------- | -------- | ------ |
| STRUCT    | DEVICE_BUFFER |           0 |          1 | 536870912 |    7.264 ms |    7.111 ms |    7.260 ms |    7.107 ms |   -2.11% | FAST   |
| STRUCT    | DEVICE_BUFFER |        1000 |          1 | 536870912 |    9.173 ms |    9.029 ms |    9.169 ms |    9.025 ms |   -1.57% | SAME   |
| STRUCT    | DEVICE_BUFFER |           0 |         32 | 536870912 |    8.012 ms |    7.876 ms |    8.008 ms |    7.872 ms |   -1.70% | SAME   |
| STRUCT    | DEVICE_BUFFER |        1000 |         32 | 536870912 |    7.801 ms |    7.623 ms |    7.797 ms |    7.619 ms |   -2.28% | FAST   |

## parquet_read_io_compression

| io_type       | compression_type | cardinality | run_length | data_size | CPUTime_old | CPUTime_new | GPUTime_old | GPUTime_new | PCT_DIFF | STATUS |
| ------------- | ---------------- | ----------- | ---------- | --------- | ----------- | ----------- | ----------- | ----------- | -------- | ------ |
| FILEPATH      | SNAPPY           |           0 |          1 | 536870912 |  213.466 ms |  211.993 ms |  213.460 ms |  211.987 ms |   -0.69% | SAME   |
| HOST_BUFFER   | SNAPPY           |           0 |          1 | 536870912 |  215.673 ms |  215.514 ms |  215.667 ms |  215.508 ms |   -0.07% | SAME   |
| DEVICE_BUFFER | SNAPPY           |           0 |          1 | 536870912 |  173.021 ms |  172.636 ms |  173.015 ms |  172.631 ms |   -0.22% | SAME   |
| FILEPATH      | ZSTD             |           0 |          1 | 536870912 |  190.745 ms |  191.857 ms |  190.739 ms |  191.850 ms |   +0.58% | SAME   |
| HOST_BUFFER   | ZSTD             |           0 |          1 | 536870912 |  189.449 ms |  189.850 ms |  189.443 ms |  189.845 ms |   +0.21% | SAME   |
| DEVICE_BUFFER | ZSTD             |           0 |          1 | 536870912 |  149.283 ms |  149.642 ms |  149.277 ms |  149.637 ms |   +0.24% | SAME   |
| FILEPATH      | NONE             |           0 |          1 | 536870912 |  165.640 ms |  167.157 ms |  165.634 ms |  167.151 ms |   +0.92% | SAME   |
| HOST_BUFFER   | NONE             |           0 |          1 | 536870912 |  164.067 ms |  165.248 ms |  164.061 ms |  165.243 ms |   +0.72% | SAME   |
| DEVICE_BUFFER | NONE             |           0 |          1 | 536870912 |  122.526 ms |  122.816 ms |  122.521 ms |  122.811 ms |   +0.24% | SAME   |
| FILEPATH      | SNAPPY           |        1000 |          1 | 536870912 |  188.168 ms |  189.255 ms |  188.162 ms |  189.250 ms |   +0.58% | SAME   |
| HOST_BUFFER   | SNAPPY           |        1000 |          1 | 536870912 |  186.851 ms |  187.749 ms |  186.845 ms |  187.744 ms |   +0.48% | SAME   |
| DEVICE_BUFFER | SNAPPY           |        1000 |          1 | 536870912 |  177.609 ms |  177.608 ms |  177.603 ms |  177.603 ms |   +0.00% | SAME   |
| FILEPATH      | ZSTD             |        1000 |          1 | 536870912 |  186.705 ms |  187.189 ms |  186.699 ms |  187.184 ms |   +0.26% | SAME   |
| HOST_BUFFER   | ZSTD             |        1000 |          1 | 536870912 |  185.540 ms |  186.824 ms |  185.534 ms |  186.818 ms |   +0.69% | SAME   |
| DEVICE_BUFFER | ZSTD             |        1000 |          1 | 536870912 |  176.931 ms |  176.306 ms |  176.926 ms |  176.301 ms |   -0.35% | SAME   |
| FILEPATH      | NONE             |        1000 |          1 | 536870912 |  172.820 ms |  172.771 ms |  172.814 ms |  172.765 ms |   -0.03% | SAME   |
| HOST_BUFFER   | NONE             |        1000 |          1 | 536870912 |  171.718 ms |  171.945 ms |  171.712 ms |  171.939 ms |   +0.13% | SAME   |
| DEVICE_BUFFER | NONE             |        1000 |          1 | 536870912 |  160.688 ms |  160.949 ms |  160.683 ms |  160.944 ms |   +0.16% | SAME   |
| FILEPATH      | SNAPPY           |           0 |         32 | 536870912 |  187.410 ms |  187.456 ms |  187.404 ms |  187.450 ms |   +0.02% | SAME   |
| HOST_BUFFER   | SNAPPY           |           0 |         32 | 536870912 |  186.032 ms |  186.727 ms |  186.027 ms |  186.722 ms |   +0.37% | SAME   |
| DEVICE_BUFFER | SNAPPY           |           0 |         32 | 536870912 |  182.707 ms |  183.166 ms |  182.701 ms |  183.161 ms |   +0.25% | SAME   |
| FILEPATH      | ZSTD             |           0 |         32 | 536870912 |  163.639 ms |  164.304 ms |  163.634 ms |  164.298 ms |   +0.41% | SAME   |
| HOST_BUFFER   | ZSTD             |           0 |         32 | 536870912 |  162.284 ms |  162.803 ms |  162.278 ms |  162.797 ms |   +0.32% | SAME   |
| DEVICE_BUFFER | ZSTD             |           0 |         32 | 536870912 |  159.758 ms |  160.238 ms |  159.752 ms |  160.233 ms |   +0.30% | SAME   |
| FILEPATH      | NONE             |           0 |         32 | 536870912 |  151.367 ms |  150.177 ms |  151.361 ms |  150.171 ms |   -0.79% | SAME   |
| HOST_BUFFER   | NONE             |           0 |         32 | 536870912 |  149.400 ms |  150.665 ms |  149.395 ms |  150.660 ms |   +0.85% | SAME   |
| DEVICE_BUFFER | NONE             |           0 |         32 | 536870912 |  112.280 ms |  113.217 ms |  112.275 ms |  113.212 ms |   +0.83% | SAME   |
| FILEPATH      | SNAPPY           |        1000 |         32 | 536870912 |  136.292 ms |  137.662 ms |  136.286 ms |  137.657 ms |   +1.01% | SAME   |
| HOST_BUFFER   | SNAPPY           |        1000 |         32 | 536870912 |  136.091 ms |  136.482 ms |  136.086 ms |  136.477 ms |   +0.29% | SAME   |
| DEVICE_BUFFER | SNAPPY           |        1000 |         32 | 536870912 |  134.755 ms |  134.814 ms |  134.750 ms |  134.809 ms |   +0.04% | SAME   |
| FILEPATH      | ZSTD             |        1000 |         32 | 536870912 |  142.284 ms |  142.968 ms |  142.279 ms |  142.962 ms |   +0.48% | SAME   |
| HOST_BUFFER   | ZSTD             |        1000 |         32 | 536870912 |  141.484 ms |  142.126 ms |  141.479 ms |  142.121 ms |   +0.45% | SAME   |
| DEVICE_BUFFER | ZSTD             |        1000 |         32 | 536870912 |  140.358 ms |  140.746 ms |  140.353 ms |  140.741 ms |   +0.28% | SAME   |
| FILEPATH      | NONE             |        1000 |         32 | 536870912 |  135.637 ms |  135.514 ms |  135.632 ms |  135.509 ms |   -0.09% | SAME   |
| HOST_BUFFER   | NONE             |        1000 |         32 | 536870912 |  134.483 ms |  134.576 ms |  134.478 ms |  134.571 ms |   +0.07% | SAME   |
| DEVICE_BUFFER | NONE             |        1000 |         32 | 536870912 |  130.263 ms |  130.988 ms |  130.258 ms |  130.983 ms |   +0.56% | SAME   |

## parquet_read_io_small_mixed

| io_type  | cardinality | run_length | num_string_cols | data_size | CPUTime_old | CPUTime_new | GPUTime_old | GPUTime_new | PCT_DIFF | STATUS |
| -------- | ----------- | ---------- | --------------- | --------- | ----------- | ----------- | ----------- | ----------- | -------- | ------ |
| FILEPATH |           0 |          1 |               1 | 536870912 |    4.063 ms |    4.052 ms |    4.059 ms |    4.048 ms |   -0.27% | SAME   |
| FILEPATH |        1000 |          1 |               1 | 536870912 |    1.902 ms |    1.390 ms |    1.898 ms |    1.386 ms |  -26.98% | FAST   |
| FILEPATH |           0 |         32 |               1 | 536870912 |    3.871 ms |    3.162 ms |    3.866 ms |    3.158 ms |  -18.31% | FAST   |
| FILEPATH |        1000 |         32 |               1 | 536870912 |    1.733 ms |    1.255 ms |    1.728 ms |    1.251 ms |  -27.60% | FAST   |
| FILEPATH |           0 |          1 |               2 | 536870912 |    4.370 ms |    4.246 ms |    4.365 ms |    4.241 ms |   -2.84% | FAST   |
| FILEPATH |        1000 |          1 |               2 | 536870912 |    1.621 ms |    1.399 ms |    1.618 ms |    1.395 ms |  -13.78% | FAST   |
| FILEPATH |           0 |         32 |               2 | 536870912 |    3.911 ms |    4.153 ms |    3.907 ms |    4.149 ms |   +6.19% | SLOW   |
| FILEPATH |        1000 |         32 |               2 | 536870912 |    1.271 ms |    1.474 ms |    1.267 ms |    1.470 ms |  +16.02% | SLOW   |
| FILEPATH |           0 |          1 |               3 | 536870912 |    4.491 ms |    4.493 ms |    4.487 ms |    4.489 ms |   +0.04% | SAME   |
| FILEPATH |        1000 |          1 |               3 | 536870912 |    1.749 ms |    1.939 ms |    1.745 ms |    1.935 ms |  +10.89% | SLOW   |
| FILEPATH |           0 |         32 |               3 | 536870912 |    4.428 ms |    4.462 ms |    4.424 ms |    4.457 ms |   +0.75% | SAME   |
| FILEPATH |        1000 |         32 |               3 | 536870912 |    1.749 ms |    1.758 ms |    1.745 ms |    1.753 ms |   +0.46% | SAME   |

## parquet_read_long_strings

| io_type       | cardinality | data_size | avg_string_length | CPUTime_old | CPUTime_new | GPUTime_old | GPUTime_new | PCT_DIFF | STATUS |
| ------------- | ----------- | --------- | ----------------- | ----------- | ----------- | ----------- | ----------- | -------- | ------ |
| DEVICE_BUFFER |           0 | 536870912 | 2^4 = 16          |   18.226 ms |   18.102 ms |   18.222 ms |   18.098 ms |   -0.68% | SAME   |
| DEVICE_BUFFER |        1000 | 536870912 | 2^4 = 16          |    6.881 ms |    6.727 ms |    6.876 ms |    6.723 ms |   -2.23% | FAST   |
| DEVICE_BUFFER |           0 | 536870912 | 2^6 = 64          |   13.710 ms |   13.608 ms |   13.705 ms |   13.604 ms |   -0.74% | SAME   |
| DEVICE_BUFFER |        1000 | 536870912 | 2^6 = 64          |    5.411 ms |    5.313 ms |    5.407 ms |    5.309 ms |   -1.81% | SAME   |
| DEVICE_BUFFER |           0 | 536870912 | 2^8 = 256         |    9.217 ms |    9.342 ms |    9.214 ms |    9.338 ms |   +1.35% | SAME   |
| DEVICE_BUFFER |        1000 | 536870912 | 2^8 = 256         |    7.531 ms |    7.516 ms |    7.527 ms |    7.512 ms |   -0.20% | SAME   |
| DEVICE_BUFFER |           0 | 536870912 | 2^10 = 1024       |    8.677 ms |    8.708 ms |    8.673 ms |    8.704 ms |   +0.36% | SAME   |
| DEVICE_BUFFER |        1000 | 536870912 | 2^10 = 1024       |   11.180 ms |   11.152 ms |   11.176 ms |   11.147 ms |   -0.26% | SAME   |
| DEVICE_BUFFER |           0 | 536870912 | 2^12 = 4096       |    8.413 ms |    8.391 ms |    8.409 ms |    8.387 ms |   -0.26% | SAME   |
| DEVICE_BUFFER |        1000 | 536870912 | 2^12 = 4096       |    8.436 ms |    8.413 ms |    8.432 ms |    8.410 ms |   -0.26% | SAME   |
| DEVICE_BUFFER |           0 | 536870912 | 2^14 = 16384      |    8.398 ms |    8.264 ms |    8.394 ms |    8.260 ms |   -1.60% | SAME   |
| DEVICE_BUFFER |        1000 | 536870912 | 2^14 = 16384      |    8.345 ms |    8.229 ms |    8.342 ms |    8.225 ms |   -1.40% | SAME   |
| DEVICE_BUFFER |           0 | 536870912 | 2^16 = 65536      |    9.409 ms |    9.228 ms |    9.405 ms |    9.224 ms |   -1.92% | SAME   |
| DEVICE_BUFFER |        1000 | 536870912 | 2^16 = 65536      |    9.460 ms |    9.399 ms |    9.456 ms |    9.395 ms |   -0.65% | SAME   |

## parquet_read_misc_options

| column_selection | row_selection | str_to_categories | uses_pandas_metadata | timestamp_type | CPUTime_old | CPUTime_new | GPUTime_old | GPUTime_new | PCT_DIFF | STATUS |
| ---------------- | ------------- | ----------------- | -------------------- | -------------- | ----------- | ----------- | ----------- | ----------- | -------- | ------ |
| ALL              | ALL           | YES               | YES                  | EMPTY          |  224.786 ms |  224.316 ms |  224.780 ms |  224.310 ms |   -0.21% | SAME   |
| ALL              | ALL           | YES               | NO                   | EMPTY          |  224.779 ms |  224.302 ms |  224.773 ms |  224.295 ms |   -0.21% | SAME   |
| ALL              | ALL           | NO                | YES                  | EMPTY          |  229.851 ms |  229.265 ms |  229.845 ms |  229.258 ms |   -0.26% | SAME   |
| ALL              | ALL           | NO                | NO                   | EMPTY          |  229.767 ms |  230.035 ms |  229.761 ms |  230.028 ms |   +0.12% | SAME   |

## parquet_read_row_selection

| column_selection | row_selection | str_to_categories | uses_pandas_metadata | timestamp_type | CPUTime_old | CPUTime_new | GPUTime_old | GPUTime_new | PCT_DIFF | STATUS |
| ---------------- | ------------- | ----------------- | -------------------- | -------------- | ----------- | ----------- | ----------- | ----------- | -------- | ------ |
| ALL              | ALL           | YES               | YES                  | EMPTY          |  224.934 ms |  224.844 ms |  224.928 ms |  224.838 ms |   -0.04% | SAME   |
| ALL              | NROWS         | YES               | YES                  | EMPTY          |  463.996 ms |  467.241 ms |  463.989 ms |  467.232 ms |   +0.70% | SAME   |

## parquet_read_subrowgroup_chunks

| T         | io_type       | cardinality | run_length | chunk_read_limit | pass_read_limit | data_size | CPUTime_old | CPUTime_new | GPUTime_old | GPUTime_new | PCT_DIFF | STATUS |
| --------- | ------------- | ----------- | ---------- | ---------------- | --------------- | --------- | ----------- | ----------- | ----------- | ----------- | -------- | ------ |
| INTEGRAL  | DEVICE_BUFFER |           0 |          1 |                0 |               0 | 536870912 |    7.779 ms |    7.849 ms |    7.775 ms |    7.845 ms |   +0.90% | SAME   |
| INTEGRAL  | DEVICE_BUFFER |        1000 |          1 |                0 |               0 | 536870912 |    7.444 ms |    7.496 ms |    7.440 ms |    7.492 ms |   +0.70% | SAME   |
| INTEGRAL  | DEVICE_BUFFER |           0 |         32 |                0 |               0 | 536870912 |    6.220 ms |    6.267 ms |    6.215 ms |    6.264 ms |   +0.79% | SAME   |
| INTEGRAL  | DEVICE_BUFFER |        1000 |         32 |                0 |               0 | 536870912 |    6.386 ms |    6.437 ms |    6.383 ms |    6.433 ms |   +0.78% | SAME   |
| INTEGRAL  | DEVICE_BUFFER |           0 |          1 |           500000 |               0 | 536870912 |  128.246 ms |  129.069 ms |  128.241 ms |  129.064 ms |   +0.64% | SAME   |
| INTEGRAL  | DEVICE_BUFFER |        1000 |          1 |           500000 |               0 | 536870912 |  124.076 ms |  124.476 ms |  124.071 ms |  124.471 ms |   +0.32% | SAME   |
| INTEGRAL  | DEVICE_BUFFER |           0 |         32 |           500000 |               0 | 536870912 |  105.713 ms |  106.539 ms |  105.709 ms |  106.535 ms |   +0.78% | SAME   |
| INTEGRAL  | DEVICE_BUFFER |        1000 |         32 |           500000 |               0 | 536870912 |  104.998 ms |  105.913 ms |  104.993 ms |  105.909 ms |   +0.87% | SAME   |
| INTEGRAL  | DEVICE_BUFFER |           0 |          1 |                0 |          500000 | 536870912 |  146.905 ms |  148.910 ms |  146.900 ms |  148.905 ms |   +1.36% | SAME   |
| INTEGRAL  | DEVICE_BUFFER |        1000 |          1 |                0 |          500000 | 536870912 |  136.736 ms |  139.692 ms |  136.731 ms |  139.687 ms |   +2.16% | SLOW   |
| INTEGRAL  | DEVICE_BUFFER |           0 |         32 |                0 |          500000 | 536870912 |   43.822 ms |   44.889 ms |   43.818 ms |   44.885 ms |   +2.44% | SLOW   |
| INTEGRAL  | DEVICE_BUFFER |        1000 |         32 |                0 |          500000 | 536870912 |   52.001 ms |   52.916 ms |   51.997 ms |   52.912 ms |   +1.76% | SAME   |
| INTEGRAL  | DEVICE_BUFFER |           0 |          1 |           500000 |          500000 | 536870912 |  157.730 ms |  162.907 ms |  157.724 ms |  162.902 ms |   +3.28% | SLOW   |
| INTEGRAL  | DEVICE_BUFFER |        1000 |          1 |           500000 |          500000 | 536870912 |  147.139 ms |  151.727 ms |  147.134 ms |  151.722 ms |   +3.12% | SLOW   |
| INTEGRAL  | DEVICE_BUFFER |           0 |         32 |           500000 |          500000 | 536870912 |   88.511 ms |   92.263 ms |   88.506 ms |   92.259 ms |   +4.24% | SLOW   |
| INTEGRAL  | DEVICE_BUFFER |        1000 |         32 |           500000 |          500000 | 536870912 |   95.650 ms |   99.776 ms |   95.645 ms |   99.771 ms |   +4.31% | SLOW   |
| FLOAT     | DEVICE_BUFFER |           0 |          1 |                0 |               0 | 536870912 |    4.605 ms |    4.629 ms |    4.601 ms |    4.625 ms |   +0.52% | SAME   |
| FLOAT     | DEVICE_BUFFER |        1000 |          1 |                0 |               0 | 536870912 |    5.056 ms |    5.092 ms |    5.053 ms |    5.088 ms |   +0.69% | SAME   |
| FLOAT     | DEVICE_BUFFER |           0 |         32 |                0 |               0 | 536870912 |    3.691 ms |    3.713 ms |    3.688 ms |    3.709 ms |   +0.57% | SAME   |
| FLOAT     | DEVICE_BUFFER |        1000 |         32 |                0 |               0 | 536870912 |    4.274 ms |    4.298 ms |    4.270 ms |    4.294 ms |   +0.56% | SAME   |
| FLOAT     | DEVICE_BUFFER |           0 |          1 |           500000 |               0 | 536870912 |   44.630 ms |   45.456 ms |   44.626 ms |   45.452 ms |   +1.85% | SAME   |
| FLOAT     | DEVICE_BUFFER |        1000 |          1 |           500000 |               0 | 536870912 |   60.259 ms |   61.082 ms |   60.255 ms |   61.078 ms |   +1.37% | SAME   |
| FLOAT     | DEVICE_BUFFER |           0 |         32 |           500000 |               0 | 536870912 |   49.161 ms |   49.968 ms |   49.157 ms |   49.964 ms |   +1.64% | SAME   |
| FLOAT     | DEVICE_BUFFER |        1000 |         32 |           500000 |               0 | 536870912 |   49.521 ms |   50.565 ms |   49.517 ms |   50.561 ms |   +2.11% | SLOW   |
| FLOAT     | DEVICE_BUFFER |           0 |          1 |                0 |          500000 | 536870912 |    5.331 ms |    5.346 ms |    5.328 ms |    5.342 ms |   +0.26% | SAME   |
| FLOAT     | DEVICE_BUFFER |        1000 |          1 |                0 |          500000 | 536870912 |   77.648 ms |   79.753 ms |   77.644 ms |   79.749 ms |   +2.71% | SLOW   |
| FLOAT     | DEVICE_BUFFER |           0 |         32 |                0 |          500000 | 536870912 |    4.456 ms |    4.492 ms |    4.452 ms |    4.489 ms |   +0.83% | SAME   |
| FLOAT     | DEVICE_BUFFER |        1000 |         32 |                0 |          500000 | 536870912 |   39.386 ms |   40.815 ms |   39.382 ms |   40.811 ms |   +3.63% | SLOW   |
| FLOAT     | DEVICE_BUFFER |           0 |          1 |           500000 |          500000 | 536870912 |   39.703 ms |   40.748 ms |   39.699 ms |   40.744 ms |   +2.63% | SLOW   |
| FLOAT     | DEVICE_BUFFER |        1000 |          1 |           500000 |          500000 | 536870912 |   83.740 ms |   86.442 ms |   83.735 ms |   86.438 ms |   +3.23% | SLOW   |
| FLOAT     | DEVICE_BUFFER |           0 |         32 |           500000 |          500000 | 536870912 |   43.543 ms |   44.730 ms |   43.539 ms |   44.726 ms |   +2.73% | SLOW   |
| FLOAT     | DEVICE_BUFFER |        1000 |         32 |           500000 |          500000 | 536870912 |   60.037 ms |   62.116 ms |   60.033 ms |   62.112 ms |   +3.46% | SLOW   |
| BOOL8     | DEVICE_BUFFER |           0 |          1 |                0 |               0 | 536870912 |   15.043 ms |   15.158 ms |   15.039 ms |   15.154 ms |   +0.76% | SAME   |
| BOOL8     | DEVICE_BUFFER |        1000 |          1 |                0 |               0 | 536870912 |   14.854 ms |   14.950 ms |   14.850 ms |   14.946 ms |   +0.65% | SAME   |
| BOOL8     | DEVICE_BUFFER |           0 |         32 |                0 |               0 | 536870912 |   13.935 ms |   14.025 ms |   13.932 ms |   14.021 ms |   +0.64% | SAME   |
| BOOL8     | DEVICE_BUFFER |        1000 |         32 |                0 |               0 | 536870912 |   13.776 ms |   13.751 ms |   13.772 ms |   13.748 ms |   -0.17% | SAME   |
| BOOL8     | DEVICE_BUFFER |           0 |          1 |           500000 |               0 | 536870912 |  585.410 ms |  590.025 ms |  585.402 ms |  590.018 ms |   +0.79% | SAME   |
| BOOL8     | DEVICE_BUFFER |        1000 |          1 |           500000 |               0 | 536870912 |  590.271 ms |  592.537 ms |  590.262 ms |  592.529 ms |   +0.38% | SAME   |
| BOOL8     | DEVICE_BUFFER |           0 |         32 |           500000 |               0 | 536870912 |  575.626 ms |  577.371 ms |  575.617 ms |  577.363 ms |   +0.30% | SAME   |
| BOOL8     | DEVICE_BUFFER |        1000 |         32 |           500000 |               0 | 536870912 |  574.516 ms |  571.128 ms |  574.508 ms |  571.120 ms |   -0.59% | SAME   |
| BOOL8     | DEVICE_BUFFER |           0 |          1 |                0 |          500000 | 536870912 |  339.427 ms |  345.443 ms |  339.421 ms |  345.436 ms |   +1.77% | SAME   |
| BOOL8     | DEVICE_BUFFER |        1000 |          1 |                0 |          500000 | 536870912 |  342.982 ms |  349.714 ms |  342.975 ms |  349.708 ms |   +1.96% | SAME   |
| BOOL8     | DEVICE_BUFFER |           0 |         32 |                0 |          500000 | 536870912 |  192.909 ms |  196.290 ms |  192.904 ms |  196.285 ms |   +1.75% | SAME   |
| BOOL8     | DEVICE_BUFFER |        1000 |         32 |                0 |          500000 | 536870912 |  193.054 ms |  193.840 ms |  193.049 ms |  193.835 ms |   +0.41% | SAME   |
| BOOL8     | DEVICE_BUFFER |           0 |          1 |           500000 |          500000 | 536870912 |  373.484 ms |  375.829 ms |  373.477 ms |  375.823 ms |   +0.63% | SAME   |
| BOOL8     | DEVICE_BUFFER |        1000 |          1 |           500000 |          500000 | 536870912 |  379.699 ms |  383.040 ms |  379.692 ms |  383.032 ms |   +0.88% | SAME   |
| BOOL8     | DEVICE_BUFFER |           0 |         32 |           500000 |          500000 | 536870912 |  279.675 ms |  286.874 ms |  279.669 ms |  286.868 ms |   +2.57% | SLOW   |
| BOOL8     | DEVICE_BUFFER |        1000 |         32 |           500000 |          500000 | 536870912 |  279.722 ms |  286.709 ms |  279.716 ms |  286.703 ms |   +2.50% | SLOW   |
| DECIMAL   | DEVICE_BUFFER |           0 |          1 |                0 |               0 | 536870912 |    6.350 ms |    6.383 ms |    6.346 ms |    6.379 ms |   +0.52% | SAME   |
| DECIMAL   | DEVICE_BUFFER |        1000 |          1 |                0 |               0 | 536870912 |    4.040 ms |    4.081 ms |    4.036 ms |    4.077 ms |   +1.02% | SAME   |
| DECIMAL   | DEVICE_BUFFER |           0 |         32 |                0 |               0 | 536870912 |    4.952 ms |    4.999 ms |    4.948 ms |    4.995 ms |   +0.95% | SAME   |
| DECIMAL   | DEVICE_BUFFER |        1000 |         32 |                0 |               0 | 536870912 |    3.289 ms |    3.320 ms |    3.285 ms |    3.316 ms |   +0.94% | SAME   |
| DECIMAL   | DEVICE_BUFFER |           0 |          1 |           500000 |               0 | 536870912 |   30.668 ms |   31.419 ms |   30.663 ms |   31.415 ms |   +2.45% | SLOW   |
| DECIMAL   | DEVICE_BUFFER |        1000 |          1 |           500000 |               0 | 536870912 |   36.536 ms |   36.761 ms |   36.532 ms |   36.757 ms |   +0.62% | SAME   |
| DECIMAL   | DEVICE_BUFFER |           0 |         32 |           500000 |               0 | 536870912 |   31.995 ms |   32.498 ms |   31.991 ms |   32.493 ms |   +1.57% | SAME   |
| DECIMAL   | DEVICE_BUFFER |        1000 |         32 |           500000 |               0 | 536870912 |   30.502 ms |   31.125 ms |   30.498 ms |   31.121 ms |   +2.04% | SLOW   |
| DECIMAL   | DEVICE_BUFFER |           0 |          1 |                0 |          500000 | 536870912 |  116.575 ms |  118.363 ms |  116.570 ms |  118.357 ms |   +1.53% | SAME   |
| DECIMAL   | DEVICE_BUFFER |        1000 |          1 |                0 |          500000 | 536870912 |   53.893 ms |   55.149 ms |   53.889 ms |   55.144 ms |   +2.33% | SLOW   |
| DECIMAL   | DEVICE_BUFFER |           0 |         32 |                0 |          500000 | 536870912 |   25.196 ms |   25.961 ms |   25.192 ms |   25.957 ms |   +3.04% | SLOW   |
| DECIMAL   | DEVICE_BUFFER |        1000 |         32 |                0 |          500000 | 536870912 |   26.455 ms |   27.421 ms |   26.452 ms |   27.417 ms |   +3.65% | SLOW   |
| DECIMAL   | DEVICE_BUFFER |           0 |          1 |           500000 |          500000 | 536870912 |  120.262 ms |  121.838 ms |  120.257 ms |  121.832 ms |   +1.31% | SAME   |
| DECIMAL   | DEVICE_BUFFER |        1000 |          1 |           500000 |          500000 | 536870912 |   57.704 ms |   59.413 ms |   57.699 ms |   59.409 ms |   +2.96% | SLOW   |
| DECIMAL   | DEVICE_BUFFER |           0 |         32 |           500000 |          500000 | 536870912 |   39.151 ms |   41.083 ms |   39.147 ms |   41.079 ms |   +4.94% | SLOW   |
| DECIMAL   | DEVICE_BUFFER |        1000 |         32 |           500000 |          500000 | 536870912 |   40.798 ms |   42.464 ms |   40.794 ms |   42.459 ms |   +4.08% | SLOW   |
| TIMESTAMP | DEVICE_BUFFER |           0 |          1 |                0 |               0 | 536870912 |    7.092 ms |    7.139 ms |    7.088 ms |    7.135 ms |   +0.66% | SAME   |
| TIMESTAMP | DEVICE_BUFFER |        1000 |          1 |                0 |               0 | 536870912 |    4.620 ms |    4.671 ms |    4.616 ms |    4.667 ms |   +1.10% | SAME   |
| TIMESTAMP | DEVICE_BUFFER |           0 |         32 |                0 |               0 | 536870912 |    5.307 ms |    5.339 ms |    5.303 ms |    5.335 ms |   +0.60% | SAME   |
| TIMESTAMP | DEVICE_BUFFER |        1000 |         32 |                0 |               0 | 536870912 |    3.856 ms |    3.875 ms |    3.852 ms |    3.871 ms |   +0.49% | SAME   |
| TIMESTAMP | DEVICE_BUFFER |           0 |          1 |           500000 |               0 | 536870912 |   52.495 ms |   53.057 ms |   52.490 ms |   53.052 ms |   +1.07% | SAME   |
| TIMESTAMP | DEVICE_BUFFER |        1000 |          1 |           500000 |               0 | 536870912 |   48.417 ms |   49.186 ms |   48.412 ms |   49.181 ms |   +1.59% | SAME   |
| TIMESTAMP | DEVICE_BUFFER |           0 |         32 |           500000 |               0 | 536870912 |   41.457 ms |   42.058 ms |   41.453 ms |   42.053 ms |   +1.45% | SAME   |
| TIMESTAMP | DEVICE_BUFFER |        1000 |         32 |           500000 |               0 | 536870912 |   39.720 ms |   40.360 ms |   39.716 ms |   40.356 ms |   +1.61% | SAME   |
| TIMESTAMP | DEVICE_BUFFER |           0 |          1 |                0 |          500000 | 536870912 |  140.173 ms |  141.909 ms |  140.167 ms |  141.903 ms |   +1.24% | SAME   |
| TIMESTAMP | DEVICE_BUFFER |        1000 |          1 |                0 |          500000 | 536870912 |   67.338 ms |   69.439 ms |   67.333 ms |   69.434 ms |   +3.12% | SLOW   |
| TIMESTAMP | DEVICE_BUFFER |           0 |         32 |                0 |          500000 | 536870912 |   32.735 ms |   33.675 ms |   32.731 ms |   33.671 ms |   +2.87% | SLOW   |
| TIMESTAMP | DEVICE_BUFFER |        1000 |         32 |                0 |          500000 | 536870912 |   33.270 ms |   34.314 ms |   33.266 ms |   34.310 ms |   +3.14% | SLOW   |
| TIMESTAMP | DEVICE_BUFFER |           0 |          1 |           500000 |          500000 | 536870912 |  145.244 ms |  147.090 ms |  145.239 ms |  147.084 ms |   +1.27% | SAME   |
| TIMESTAMP | DEVICE_BUFFER |        1000 |          1 |           500000 |          500000 | 536870912 |   72.418 ms |   74.754 ms |   72.414 ms |   74.749 ms |   +3.22% | SLOW   |
| TIMESTAMP | DEVICE_BUFFER |           0 |         32 |           500000 |          500000 | 536870912 |   50.118 ms |   52.582 ms |   50.113 ms |   52.577 ms |   +4.92% | SLOW   |
| TIMESTAMP | DEVICE_BUFFER |        1000 |         32 |           500000 |          500000 | 536870912 |   50.756 ms |   52.392 ms |   50.751 ms |   52.388 ms |   +3.23% | SLOW   |
| DURATION  | DEVICE_BUFFER |           0 |          1 |                0 |               0 | 536870912 |    5.494 ms |    5.532 ms |    5.490 ms |    5.528 ms |   +0.69% | SAME   |
| DURATION  | DEVICE_BUFFER |        1000 |          1 |                0 |               0 | 536870912 |    4.654 ms |    4.690 ms |    4.651 ms |    4.686 ms |   +0.75% | SAME   |
| DURATION  | DEVICE_BUFFER |           0 |         32 |                0 |               0 | 536870912 |    3.337 ms |    3.352 ms |    3.333 ms |    3.348 ms |   +0.45% | SAME   |
| DURATION  | DEVICE_BUFFER |        1000 |         32 |                0 |               0 | 536870912 |    3.854 ms |    3.874 ms |    3.850 ms |    3.871 ms |   +0.55% | SAME   |
| DURATION  | DEVICE_BUFFER |           0 |          1 |           500000 |               0 | 536870912 |   48.430 ms |   49.138 ms |   48.425 ms |   49.134 ms |   +1.46% | SAME   |
| DURATION  | DEVICE_BUFFER |        1000 |          1 |           500000 |               0 | 536870912 |   48.174 ms |   49.148 ms |   48.170 ms |   49.143 ms |   +2.02% | SLOW   |
| DURATION  | DEVICE_BUFFER |           0 |         32 |           500000 |               0 | 536870912 |   39.422 ms |   40.152 ms |   39.418 ms |   40.148 ms |   +1.85% | SAME   |
| DURATION  | DEVICE_BUFFER |        1000 |         32 |           500000 |               0 | 536870912 |   39.642 ms |   40.493 ms |   39.638 ms |   40.489 ms |   +2.15% | SLOW   |
| DURATION  | DEVICE_BUFFER |           0 |          1 |                0 |          500000 | 536870912 |  108.266 ms |  110.429 ms |  108.261 ms |  110.425 ms |   +2.00% | SAME   |
| DURATION  | DEVICE_BUFFER |        1000 |          1 |                0 |          500000 | 536870912 |   66.847 ms |   68.688 ms |   66.843 ms |   68.683 ms |   +2.75% | SLOW   |
| DURATION  | DEVICE_BUFFER |           0 |         32 |                0 |          500000 | 536870912 |    3.985 ms |    4.053 ms |    3.981 ms |    4.049 ms |   +1.71% | SAME   |
| DURATION  | DEVICE_BUFFER |        1000 |         32 |                0 |          500000 | 536870912 |   33.125 ms |   34.152 ms |   33.121 ms |   34.147 ms |   +3.10% | SLOW   |
| DURATION  | DEVICE_BUFFER |           0 |          1 |           500000 |          500000 | 536870912 |  113.710 ms |  115.893 ms |  113.705 ms |  115.887 ms |   +1.92% | SAME   |
| DURATION  | DEVICE_BUFFER |        1000 |          1 |           500000 |          500000 | 536870912 |   72.185 ms |   73.843 ms |   72.180 ms |   73.838 ms |   +2.30% | SLOW   |
| DURATION  | DEVICE_BUFFER |           0 |         32 |           500000 |          500000 | 536870912 |   37.469 ms |   38.178 ms |   37.465 ms |   38.174 ms |   +1.89% | SAME   |
| DURATION  | DEVICE_BUFFER |        1000 |         32 |           500000 |          500000 | 536870912 |   50.611 ms |   52.559 ms |   50.607 ms |   52.554 ms |   +3.85% | SLOW   |
| STRING    | DEVICE_BUFFER |           0 |          1 |                0 |               0 | 536870912 |   16.719 ms |   16.747 ms |   16.715 ms |   16.743 ms |   +0.17% | SAME   |
| STRING    | DEVICE_BUFFER |        1000 |          1 |                0 |               0 | 536870912 |    5.650 ms |    5.688 ms |    5.646 ms |    5.684 ms |   +0.67% | SAME   |
| STRING    | DEVICE_BUFFER |           0 |         32 |                0 |               0 | 536870912 |   16.727 ms |   16.723 ms |   16.723 ms |   16.719 ms |   -0.02% | SAME   |
| STRING    | DEVICE_BUFFER |        1000 |         32 |                0 |               0 | 536870912 |    5.070 ms |    5.105 ms |    5.067 ms |    5.101 ms |   +0.67% | SAME   |
| STRING    | DEVICE_BUFFER |           0 |          1 |           500000 |               0 | 536870912 |  110.894 ms |  111.408 ms |  110.889 ms |  111.403 ms |   +0.46% | SAME   |
| STRING    | DEVICE_BUFFER |        1000 |          1 |           500000 |               0 | 536870912 |   32.591 ms |   33.310 ms |   32.587 ms |   33.306 ms |   +2.21% | SLOW   |
| STRING    | DEVICE_BUFFER |           0 |         32 |           500000 |               0 | 536870912 |  110.942 ms |  111.465 ms |  110.937 ms |  111.460 ms |   +0.47% | SAME   |
| STRING    | DEVICE_BUFFER |        1000 |         32 |           500000 |               0 | 536870912 |   31.632 ms |   32.774 ms |   31.628 ms |   32.770 ms |   +3.61% | SLOW   |
| STRING    | DEVICE_BUFFER |           0 |          1 |                0 |          500000 | 536870912 |  133.272 ms |  134.708 ms |  133.267 ms |  134.702 ms |   +1.08% | SAME   |
| STRING    | DEVICE_BUFFER |        1000 |          1 |                0 |          500000 | 536870912 |   38.768 ms |   39.821 ms |   38.764 ms |   39.816 ms |   +2.71% | SLOW   |
| STRING    | DEVICE_BUFFER |           0 |         32 |                0 |          500000 | 536870912 |  133.195 ms |  134.750 ms |  133.190 ms |  134.745 ms |   +1.17% | SAME   |
| STRING    | DEVICE_BUFFER |        1000 |         32 |                0 |          500000 | 536870912 |   18.883 ms |   19.569 ms |   18.879 ms |   19.565 ms |   +3.63% | SLOW   |
| STRING    | DEVICE_BUFFER |           0 |          1 |           500000 |          500000 | 536870912 |  136.446 ms |  137.858 ms |  136.441 ms |  137.852 ms |   +1.03% | SAME   |
| STRING    | DEVICE_BUFFER |        1000 |          1 |           500000 |          500000 | 536870912 |   42.676 ms |   43.226 ms |   42.672 ms |   43.222 ms |   +1.29% | SAME   |
| STRING    | DEVICE_BUFFER |           0 |         32 |           500000 |          500000 | 536870912 |  136.448 ms |  137.798 ms |  136.443 ms |  137.792 ms |   +0.99% | SAME   |
| STRING    | DEVICE_BUFFER |        1000 |         32 |           500000 |          500000 | 536870912 |   35.556 ms |   37.119 ms |   35.552 ms |   37.114 ms |   +4.39% | SLOW   |
| LIST      | DEVICE_BUFFER |           0 |          1 |                0 |               0 | 536870912 |   28.690 ms |   28.878 ms |   28.686 ms |   28.874 ms |   +0.66% | SAME   |
| LIST      | DEVICE_BUFFER |        1000 |          1 |                0 |               0 | 536870912 |   28.308 ms |   28.512 ms |   28.304 ms |   28.508 ms |   +0.72% | SAME   |
| LIST      | DEVICE_BUFFER |           0 |         32 |                0 |               0 | 536870912 |   23.853 ms |   23.998 ms |   23.849 ms |   23.994 ms |   +0.61% | SAME   |
| LIST      | DEVICE_BUFFER |        1000 |         32 |                0 |               0 | 536870912 |   23.549 ms |   23.675 ms |   23.545 ms |   23.671 ms |   +0.54% | SAME   |
| LIST      | DEVICE_BUFFER |           0 |          1 |           500000 |               0 | 536870912 |  192.362 ms |  194.120 ms |  192.356 ms |  194.114 ms |   +0.91% | SAME   |
| LIST      | DEVICE_BUFFER |        1000 |          1 |           500000 |               0 | 536870912 |  314.779 ms |  316.784 ms |  314.773 ms |  316.777 ms |   +0.64% | SAME   |
| LIST      | DEVICE_BUFFER |           0 |         32 |           500000 |               0 | 536870912 |  281.928 ms |  283.313 ms |  281.922 ms |  283.306 ms |   +0.49% | SAME   |
| LIST      | DEVICE_BUFFER |        1000 |         32 |           500000 |               0 | 536870912 |  278.706 ms |  280.178 ms |  278.700 ms |  280.170 ms |   +0.53% | SAME   |
| LIST      | DEVICE_BUFFER |           0 |          1 |                0 |          500000 | 536870912 |  277.785 ms |  280.709 ms |  277.778 ms |  280.701 ms |   +1.05% | SAME   |
| LIST      | DEVICE_BUFFER |        1000 |          1 |                0 |          500000 | 536870912 |  470.977 ms |  474.679 ms |  470.970 ms |  474.670 ms |   +0.79% | SAME   |
| LIST      | DEVICE_BUFFER |           0 |         32 |                0 |          500000 | 536870912 |  387.035 ms |  389.955 ms |  387.027 ms |  389.946 ms |   +0.75% | SAME   |
| LIST      | DEVICE_BUFFER |        1000 |         32 |                0 |          500000 | 536870912 |  388.914 ms |  391.547 ms |  388.908 ms |  391.538 ms |   +0.68% | SAME   |
| LIST      | DEVICE_BUFFER |           0 |          1 |           500000 |          500000 | 536870912 |  381.388 ms |  385.949 ms |  381.381 ms |  385.940 ms |   +1.20% | SAME   |
| LIST      | DEVICE_BUFFER |        1000 |          1 |           500000 |          500000 | 536870912 |  693.945 ms |  697.899 ms |  693.935 ms |  697.887 ms |   +0.57% | SAME   |
| LIST      | DEVICE_BUFFER |           0 |         32 |           500000 |          500000 | 536870912 |  594.393 ms |  600.026 ms |  594.384 ms |  600.014 ms |   +0.95% | SAME   |
| LIST      | DEVICE_BUFFER |        1000 |         32 |           500000 |          500000 | 536870912 |  596.437 ms |  601.418 ms |  596.428 ms |  601.406 ms |   +0.83% | SAME   |
| STRUCT    | DEVICE_BUFFER |           0 |          1 |                0 |               0 | 536870912 |   19.775 ms |   20.327 ms |   19.771 ms |   20.323 ms |   +2.79% | SLOW   |
| STRUCT    | DEVICE_BUFFER |        1000 |          1 |                0 |               0 | 536870912 |   10.486 ms |   10.922 ms |   10.482 ms |   10.918 ms |   +4.16% | SLOW   |
| STRUCT    | DEVICE_BUFFER |           0 |         32 |                0 |               0 | 536870912 |   19.731 ms |   20.214 ms |   19.727 ms |   20.210 ms |   +2.45% | SLOW   |
| STRUCT    | DEVICE_BUFFER |        1000 |         32 |                0 |               0 | 536870912 |    9.789 ms |   10.195 ms |    9.785 ms |   10.191 ms |   +4.15% | SLOW   |
| STRUCT    | DEVICE_BUFFER |           0 |          1 |           500000 |               0 | 536870912 |  129.023 ms |  131.627 ms |  129.018 ms |  131.622 ms |   +2.02% | SLOW   |
| STRUCT    | DEVICE_BUFFER |        1000 |          1 |           500000 |               0 | 536870912 |   71.832 ms |   75.871 ms |   71.828 ms |   75.866 ms |   +5.62% | SLOW   |
| STRUCT    | DEVICE_BUFFER |           0 |         32 |           500000 |               0 | 536870912 |  129.123 ms |  132.019 ms |  129.119 ms |  132.013 ms |   +2.24% | SLOW   |
| STRUCT    | DEVICE_BUFFER |        1000 |         32 |           500000 |               0 | 536870912 |   71.910 ms |   75.570 ms |   71.906 ms |   75.565 ms |   +5.09% | SLOW   |
| STRUCT    | DEVICE_BUFFER |           0 |          1 |                0 |          500000 | 536870912 |  150.049 ms |  150.767 ms |  150.045 ms |  150.761 ms |   +0.48% | SAME   |
| STRUCT    | DEVICE_BUFFER |        1000 |          1 |                0 |          500000 | 536870912 |   82.008 ms |   84.673 ms |   82.004 ms |   84.668 ms |   +3.25% | SLOW   |
| STRUCT    | DEVICE_BUFFER |           0 |         32 |                0 |          500000 | 536870912 |  150.836 ms |  152.934 ms |  150.831 ms |  152.929 ms |   +1.39% | SAME   |
| STRUCT    | DEVICE_BUFFER |        1000 |         32 |                0 |          500000 | 536870912 |   78.656 ms |   81.882 ms |   78.651 ms |   81.877 ms |   +4.10% | SLOW   |
| STRUCT    | DEVICE_BUFFER |           0 |          1 |           500000 |          500000 | 536870912 |  152.897 ms |  153.497 ms |  152.892 ms |  153.491 ms |   +0.39% | SAME   |
| STRUCT    | DEVICE_BUFFER |        1000 |          1 |           500000 |          500000 | 536870912 |   85.129 ms |   88.423 ms |   85.124 ms |   88.417 ms |   +3.87% | SLOW   |
| STRUCT    | DEVICE_BUFFER |           0 |         32 |           500000 |          500000 | 536870912 |  153.369 ms |  157.598 ms |  153.364 ms |  157.592 ms |   +2.76% | SLOW   |
| STRUCT    | DEVICE_BUFFER |        1000 |         32 |           500000 |          500000 | 536870912 |   83.488 ms |   88.761 ms |   83.484 ms |   88.755 ms |   +6.31% | SLOW   |

## parquet_read_wide_tables

| data_type | data_size  | num_cols | cardinality | run_length | CPUTime_old | CPUTime_new | GPUTime_old | GPUTime_new | PCT_DIFF | STATUS |
| --------- | ---------- | -------- | ----------- | ---------- | ----------- | ----------- | ----------- | ----------- | -------- | ------ |
| DECIMAL   | 1073741824 |      256 |           0 |          1 |    7.782 ms |    7.847 ms |    7.778 ms |    7.843 ms |   +0.84% | SAME   |
| DECIMAL   | 1073741824 |      512 |           0 |          1 |    9.243 ms |    9.483 ms |    9.239 ms |    9.479 ms |   +2.60% | SLOW   |
| DECIMAL   | 1073741824 |     1024 |           0 |          1 |   11.455 ms |   11.840 ms |   11.451 ms |   11.836 ms |   +3.36% | SLOW   |
| DECIMAL   | 1073741824 |      256 |        1000 |          1 |    6.669 ms |    6.787 ms |    6.665 ms |    6.783 ms |   +1.77% | SAME   |
| DECIMAL   | 1073741824 |      512 |        1000 |          1 |    8.474 ms |    8.681 ms |    8.470 ms |    8.677 ms |   +2.44% | SLOW   |
| DECIMAL   | 1073741824 |     1024 |        1000 |          1 |   10.507 ms |   10.831 ms |   10.503 ms |   10.827 ms |   +3.08% | SLOW   |
| DECIMAL   | 1073741824 |      256 |           0 |         32 |    5.210 ms |    5.299 ms |    5.206 ms |    5.295 ms |   +1.71% | SAME   |
| DECIMAL   | 1073741824 |      512 |           0 |         32 |    6.917 ms |    7.179 ms |    6.913 ms |    7.175 ms |   +3.79% | SLOW   |
| DECIMAL   | 1073741824 |     1024 |           0 |         32 |    8.978 ms |    9.327 ms |    8.974 ms |    9.323 ms |   +3.89% | SLOW   |
| DECIMAL   | 1073741824 |      256 |        1000 |         32 |    5.028 ms |    5.138 ms |    5.024 ms |    5.135 ms |   +2.21% | SLOW   |
| DECIMAL   | 1073741824 |      512 |        1000 |         32 |    6.789 ms |    7.032 ms |    6.785 ms |    7.028 ms |   +3.58% | SLOW   |
| DECIMAL   | 1073741824 |     1024 |        1000 |         32 |    8.783 ms |    9.144 ms |    8.779 ms |    9.140 ms |   +4.11% | SLOW   |
| STRING    | 1073741824 |      256 |           0 |          1 |   24.013 ms |   24.093 ms |   24.009 ms |   24.089 ms |   +0.33% | SAME   |
| STRING    | 1073741824 |      512 |           0 |          1 |   25.888 ms |   26.011 ms |   25.884 ms |   26.007 ms |   +0.48% | SAME   |
| STRING    | 1073741824 |     1024 |           0 |          1 |   28.137 ms |   28.509 ms |   28.133 ms |   28.504 ms |   +1.32% | SAME   |
| STRING    | 1073741824 |      256 |        1000 |          1 |    9.079 ms |    9.181 ms |    9.075 ms |    9.177 ms |   +1.12% | SAME   |
| STRING    | 1073741824 |      512 |        1000 |          1 |   10.932 ms |   10.833 ms |   10.928 ms |   10.829 ms |   -0.91% | SAME   |
| STRING    | 1073741824 |     1024 |        1000 |          1 |   13.646 ms |   14.008 ms |   13.642 ms |   14.005 ms |   +2.66% | SLOW   |
| STRING    | 1073741824 |      256 |           0 |         32 |   24.009 ms |   24.296 ms |   24.004 ms |   24.291 ms |   +1.20% | SAME   |
| STRING    | 1073741824 |      512 |           0 |         32 |   25.949 ms |   26.114 ms |   25.945 ms |   26.110 ms |   +0.64% | SAME   |
| STRING    | 1073741824 |     1024 |           0 |         32 |   28.346 ms |   28.427 ms |   28.342 ms |   28.422 ms |   +0.28% | SAME   |
| STRING    | 1073741824 |      256 |        1000 |         32 |    8.272 ms |    8.393 ms |    8.269 ms |    8.389 ms |   +1.45% | SAME   |
| STRING    | 1073741824 |      512 |        1000 |         32 |   10.149 ms |   10.414 ms |   10.145 ms |   10.410 ms |   +2.61% | SLOW   |
| STRING    | 1073741824 |     1024 |        1000 |         32 |   12.829 ms |   13.306 ms |   12.825 ms |   13.302 ms |   +3.72% | SLOW   |


## parquet_read_wide_tables_mixed

| data_size  | num_cols | cardinality | run_length | CPUTime_old | CPUTime_new | GPUTime_old | GPUTime_new | PCT_DIFF | STATUS |
| ---------- | -------- | ----------- | ---------- | ----------- | ----------- | ----------- | ----------- | -------- | ------ |
| 1073741824 |      256 |           0 |          1 |   12.207 ms |   12.124 ms |   12.203 ms |   12.120 ms |   -0.68% | SAME   |
| 1073741824 |      512 |           0 |          1 |   13.375 ms |   13.268 ms |   13.371 ms |   13.264 ms |   -0.80% | SAME   |
| 1073741824 |     1024 |           0 |          1 |   15.768 ms |   15.231 ms |   15.764 ms |   15.228 ms |   -3.40% | FAST   |
| 1073741824 |      256 |        1000 |          1 |   11.746 ms |   11.653 ms |   11.742 ms |   11.649 ms |   -0.79% | SAME   |
| 1073741824 |      512 |        1000 |          1 |   13.069 ms |   12.779 ms |   13.065 ms |   12.775 ms |   -2.22% | FAST   |
| 1073741824 |     1024 |        1000 |          1 |   15.353 ms |   14.820 ms |   15.349 ms |   14.816 ms |   -3.47% | FAST   |
| 1073741824 |      256 |           0 |         32 |    9.340 ms |    9.245 ms |    9.336 ms |    9.241 ms |   -1.02% | SAME   |
| 1073741824 |      512 |           0 |         32 |   10.684 ms |   10.334 ms |   10.680 ms |   10.330 ms |   -3.28% | FAST   |
| 1073741824 |     1024 |           0 |         32 |   12.863 ms |   12.326 ms |   12.859 ms |   12.322 ms |   -4.18% | FAST   |
| 1073741824 |      256 |        1000 |         32 |    9.124 ms |    8.999 ms |    9.120 ms |    8.995 ms |   -1.37% | SAME   |
| 1073741824 |      512 |        1000 |         32 |   10.574 ms |   10.134 ms |   10.570 ms |   10.130 ms |   -4.16% | FAST   |
| 1073741824 |     1024 |        1000 |         32 |   12.721 ms |   12.255 ms |   12.717 ms |   12.251 ms |   -3.66% | FAST   |
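For readers checking the tables by hand: the `PCT_DIFF` column appears to be the relative change in GPU time, `(GPUTime_new - GPUTime_old) / GPUTime_old`, and the `STATUS` column appears to flag anything beyond roughly ±2%. The sketch below reproduces a few rows under those assumptions. The 2% cutoff and the helper names `pct_diff` and `status` are inferred from the rows above for illustration; they are not taken from the script that generated the tables.

```python
def pct_diff(old_ms: float, new_ms: float) -> float:
    """Relative change from old to new, in percent (positive means slower)."""
    return (new_ms - old_ms) / old_ms * 100.0


def status(pct: float, threshold: float = 2.0) -> str:
    """Classify a percentage change; the 2% threshold is an assumption."""
    if pct > threshold:
        return "SLOW"
    if pct < -threshold:
        return "FAST"
    return "SAME"


# (GPUTime_old, GPUTime_new) in ms, copied from three rows of the tables above.
rows = [
    (15.764, 15.228),  # parquet_read_wide_tables_mixed, 1024 cols, cardinality 0, run_length 1
    (13.642, 14.005),  # parquet_read_wide_tables, STRING, 1024 cols, cardinality 1000, run_length 1
    (7.778, 7.843),    # parquet_read_wide_tables, DECIMAL, 256 cols, cardinality 0, run_length 1
]

for old_ms, new_ms in rows:
    diff = pct_diff(old_ms, new_ms)
    print(f"{old_ms:8.3f} ms -> {new_ms:8.3f} ms  {diff:+.2f}%  {status(diff)}")
```

Because the inputs are the rounded millisecond values shown in the tables, a recomputed percentage can differ from the published one in the last digit for some rows.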

@davidwendt
Contributor

/ok to test

@davidwendt
Contributor

/ok to test bc10cdb

@rapidsai rapidsai deleted a comment from copy-pr-bot bot Nov 5, 2025
@davidwendt
Contributor

/merge

@rapids-bot rapids-bot bot merged commit 30baa59 into rapidsai:main Nov 5, 2025
143 checks passed
@mhaseeb123 mhaseeb123 deleted the fix/oob-access-hybrid-scan-dicts branch November 5, 2025 18:45

Labels

- 5 - Ready to Merge: Testing and reviews complete, ready to merge
- bug: Something isn't working
- cuIO: cuIO issue
- libcudf: Affects libcudf (C++/CUDA) code.
- non-breaking: Non-breaking change

Projects

Status: Burndown
