
Commit bb5cb98

Fix OOB memory access in JSON reader ingest_raw utility (#20451)
Fixes out-of-bounds memory reads in the JSON reader `ingest_raw` internal utility. The error was found in the nightly memcheck and can be reproduced using:

```
compute-sanitizer --tool memcheck gtests/JSON_TEST --gtest_filter=JsonReaderTest/JsonReaderTest.ByteRange_MultiSource/0 --rmm_mode=cuda
```

Error reported:

```
========= COMPUTE-SANITIZER
Note: Google Test filter = JsonReaderTest/JsonReaderTest.ByteRange_MultiSource/0
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from JsonReaderTest/JsonReaderTest
[ RUN      ] JsonReaderTest/JsonReaderTest.ByteRange_MultiSource/0
========= Program hit cudaErrorInvalidValue (error 1) due to "invalid argument" on CUDA API call to cudaMemcpyAsync_ptsz.
=========     Saved host backtrace up to driver entry point at error
=========     Host Frame: cudf::io::json::detail::ingest_raw_input(cudf::device_span<char, 18446744073709551615ul>, cudf::host_span<std::unique_ptr<cudf::io::datasource, std::default_delete<cudf::io::datasource> >, 18446744073709551615ul>, unsigned long, unsigned long, char, rmm::cuda_stream_view) [0xd51b19] in libcudf.so
=========     Host Frame: std::vector<cudf::io::table_with_metadata, std::allocator<cudf::io::table_with_metadata> > split_byte_range_reading<int>(cudf::host_span<std::unique_ptr<cudf::io::datasource, std::default_delete<cudf::io::datasource> >, 18446744073709551615ul>, cudf::host_span<std::unique_ptr<cudf::io::datasource, std::default_delete<cudf::io::datasource> >, 18446744073709551615ul>, cudf::io::json_reader_options const&, cudf::io::json_reader_options const&, int, rmm::cuda_stream_view, rmm::detail::cccl_async_resource_ref<cuda::mr::__4::basic_resource_ref<(cuda::mr::__4::_AllocType)1, cuda::mr::__4::device_accessible> >) [0x1bd7a1] in JSON_TEST
...
```

The `cudaMemcpyAsync` call was writing past the end of the allocated device memory buffer. Once this was fixed, another error appeared: a read past the end of the buffer returned by `ingest_raw`. This is fixed by clamping the returned size to at most the size of the device buffer itself.

Authors:
- David Wendt (https://github.com/davidwendt)

Approvers:
- Bradley Dice (https://github.com/bdice)
- Shruti Shivakumar (https://github.com/shrshi)

URL: #20451
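The clamping described in the commit message can be sketched in isolation as follows. This is only an illustration of the idea, not the libcudf code: `clamp_to_buffer` and both of its parameters are hypothetical names introduced here.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper (not part of libcudf): given the number of bytes a
// reader reports and the size of the buffer it filled, return how many
// bytes a caller may safely read, which is never more than the buffer holds.
constexpr std::size_t clamp_to_buffer(std::size_t bytes_reported, std::size_t buffer_size)
{
  return std::min(bytes_reported, buffer_size);
}
```

With this shape, a caller that slices the buffer using the clamped value cannot index past its end, even if the reported size is larger than the allocation.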
1 parent fe241e5 commit bb5cb98

1 file changed

+1
-1
lines changed

cpp/tests/io/json/json_utils.cuh

Lines changed: 1 addition & 1 deletion
@@ -33,7 +33,7 @@ std::vector<cudf::io::table_with_metadata> split_byte_range_reading(
   auto total_source_size = [&sources]() {
     return std::accumulate(sources.begin(), sources.end(), 0ul, [=](size_t sum, auto& source) {
       auto const size = source->size();
-      return sum + size;
+      return sum + size + 1;
     });
   }();
   auto find_first_delimiter_in_chunk =
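
The changed line adds one byte per source to the accumulated total. A minimal standalone sketch of that computation is below. `fake_source` and `total_size_with_extra_byte` are hypothetical names introduced for illustration, and the reading that the extra byte reserves room for a delimiter after each source is an assumption; the commit does not state the reason.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical stand-in for a data source; only its size matters here.
struct fake_source {
  std::size_t bytes;
  std::size_t size() const { return bytes; }
};

// Sum the source sizes, adding one extra byte per source, mirroring
// `sum + size + 1` in the diff. Assumption: the extra byte leaves room
// for a delimiter following each source.
std::size_t total_size_with_extra_byte(std::vector<fake_source> const& sources)
{
  return std::accumulate(sources.begin(),
                         sources.end(),
                         std::size_t{0},
                         [](std::size_t sum, fake_source const& source) {
                           return sum + source.size() + 1;
                         });
}
```

For two sources of 3 and 5 bytes this yields 10 rather than 8, so a buffer sized from the total has two bytes of headroom beyond the raw data.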
