Describe the bug
As part of testing integration with the parquet crate in #8133, I found that writing a VariantArray directly to Parquet panics.
To Reproduce
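// Imports the snippet relies on (crate paths assumed from the arrow-rs workspace layout)
use std::sync::Arc;
use arrow_array::{ArrayRef, RecordBatch};
use parquet::arrow::ArrowWriter;
use parquet_variant_compute::VariantArrayBuilder;

// Use the VariantArrayBuilder to build a VariantArray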
let mut builder = VariantArrayBuilder::new(3);
// row 1: {"name": "Alice"}
let mut variant_builder = builder.variant_builder();
variant_builder.new_object().with_field("name", "Alice").finish()?;
variant_builder.finish();
let array = builder.build();
// TODO support writing VariantArray directly
// at the moment it panics when trying to downcast to a struct array
let array: ArrayRef = Arc::new(array);
// create a RecordBatch with the VariantArray
let batch = RecordBatch::try_from_iter(vec![("data", array)])?;
// write the RecordBatch to a Parquet file
let file = std::fs::File::create("variant.parquet")?;
let mut writer = ArrowWriter::try_new(file, batch.schema(), None)?;
writer.write(&batch)?;
writer.close()?;
This results in the following panic, whose message is "struct array":
thread 'main' panicked at arrow-array/src/cast.rs:904:30:
struct array
stack backtrace:
0: __rustc::rust_begin_unwind
at /rustc/29483883eed69d5fb4db01964cdf2af4d86e9cb2/library/std/src/panicking.rs:697:5
1: core::panicking::panic_fmt
at /rustc/29483883eed69d5fb4db01964cdf2af4d86e9cb2/library/core/src/panicking.rs:75:14
2: core::panicking::panic_display
at /rustc/29483883eed69d5fb4db01964cdf2af4d86e9cb2/library/core/src/panicking.rs:268:5
3: core::option::expect_failed
at /rustc/29483883eed69d5fb4db01964cdf2af4d86e9cb2/library/core/src/option.rs:2081:5
4: core::option::Option<T>::expect
at /Users/andrewlamb/.rustup/toolchains/1.89-aarch64-apple-darwin/lib/rustlib/src/rust/library/core/src/option.rs:960:21
5: arrow_array::cast::AsArray::as_struct
at /Users/andrewlamb/Software/arrow-rs/arrow-array/src/cast.rs:904:30
6: parquet::arrow::arrow_writer::levels::LevelInfoBuilder::try_new
at ./src/arrow/arrow_writer/levels.rs:162:35
7: parquet::arrow::arrow_writer::levels::calculate_array_levels
at ./src/arrow/arrow_writer/levels.rs:55:23
8: parquet::arrow::arrow_writer::compute_leaves
at ./src/arrow/arrow_writer/mod.rs:625:18
9: parquet::arrow::arrow_writer::ArrowRowGroupWriter::write
at ./src/arrow/arrow_writer/mod.rs:839:25
10: parquet::arrow::arrow_writer::ArrowWriter<W>::write
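The backtrace shows the panic coming from an Option::expect inside AsArray::as_struct, which LevelInfoBuilder::try_new calls. The std-only sketch below models why that kind of downcast can fail. It is not the arrow-rs implementation: DataType, Array, StructArray, VariantArray, as_struct_opt and as_struct here are simplified, hypothetical stand-ins, and the assumption is that the wrapper reports a struct data type while being a different concrete type.

```rust
use std::any::Any;
use std::panic;

/// Simplified stand-in for Arrow's logical data type (hypothetical).
#[derive(Debug, PartialEq)]
enum DataType {
    Struct,
}

/// Simplified stand-in for the Array trait: a logical type plus
/// access to the concrete value for downcasting.
trait Array {
    fn data_type(&self) -> DataType;
    fn as_any(&self) -> &dyn Any;
}

/// Stand-in for a struct array.
struct StructArray;

/// Stand-in for a variant array: wraps a struct array and reports a
/// struct data type, but is a distinct concrete type.
struct VariantArray {
    inner: StructArray,
}

impl Array for StructArray {
    fn data_type(&self) -> DataType {
        DataType::Struct
    }
    fn as_any(&self) -> &dyn Any {
        self
    }
}

impl Array for VariantArray {
    fn data_type(&self) -> DataType {
        DataType::Struct
    }
    fn as_any(&self) -> &dyn Any {
        self
    }
}

/// Fallible downcast: None when the concrete type is not StructArray.
fn as_struct_opt(array: &dyn Array) -> Option<&StructArray> {
    array.as_any().downcast_ref::<StructArray>()
}

/// Panicking downcast, mirroring the expect("struct array") in the backtrace.
fn as_struct(array: &dyn Array) -> &StructArray {
    as_struct_opt(array).expect("struct array")
}

fn main() {
    let variant = VariantArray { inner: StructArray };

    // The wrapper looks like a struct by data type...
    assert_eq!(variant.data_type(), DataType::Struct);
    // ...but the concrete-type downcast fails on the wrapper itself,
    assert!(as_struct_opt(&variant).is_none());
    // while it succeeds on the wrapped struct array.
    assert!(as_struct_opt(&variant.inner).is_some());

    // The panicking variant therefore aborts the write path.
    let result = panic::catch_unwind(|| {
        as_struct(&variant);
    });
    assert!(result.is_err());
}
```

In this model, code that dispatches on the data type and then downcasts unconditionally panics on the wrapper, whereas passing the wrapped struct array (or using the fallible downcast and returning an error) does not.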
Expected behavior
Writing a VariantArray directly with ArrowWriter should succeed rather than panic.
Additional context