Description
Is your feature request related to a problem or challenge?
Currently we cannot flatten a List (or FixedSizeList) that contains a LargeList as its inner element.
We should be able to support this (at least for queries we expect to succeed). For example, we would expect something like this to succeed in array.slt:
query ???
select flatten(arrow_cast(make_array([1], [2, 3], [null], make_array(4, null, 5)), 'FixedSizeList(4, LargeList(Int64))')),
       flatten(arrow_cast(make_array([[1.1], [2.2]], [[3.3], [4.4]]), 'List(LargeList(FixedSizeList(1, Float64)))'));
----
[1, 2, 1, 3, 2] [1, 2, 3, NULL, 4, NULL, 5] [[1.1], [2.2], [3.3], [4.4]]
Currently it fails with:
1. query failed: DataFusion error: Execution error: flatten does not support type 'List(Field { name: "item", data_type: LargeList(Field { name: "item", data_type: Int64, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }), nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} })'
[SQL] select flatten(arrow_cast(make_array([1], [2, 3], [null], make_array(4, null, 5)), 'FixedSizeList(4, LargeList(Int64))')),
       flatten(arrow_cast(make_array([[1.1], [2.2]], [[3.3], [4.4]]), 'List(LargeList(FixedSizeList(1, Float64)))'));
at /Users/jeffrey/Code/datafusion/datafusion/sqllogictest/test_files/array.slt:7681
Describe the solution you'd like
We need to consider the return type; note how LargeList is missing from the inner field match here:
datafusion/datafusion/functions-nested/src/flatten.rs
Lines 107 to 110 in 35c1cfd
    List(field) | FixedSizeList(field, _) => match field.data_type() {
        List(field) | FixedSizeList(field, _) => List(Arc::clone(field)),
        _ => arg_types[0].clone(),
    },
This is where the current error is raised:
datafusion/datafusion/functions-nested/src/flatten.rs
Lines 166 to 168 in 35c1cfd
    LargeList(_) => {
        exec_err!("flatten does not support type '{:?}'", array.data_type())?
    }
- We just return an error without checking whether the flatten is possible
Perhaps we can try some sort of "best effort" approach: try to downcast the LargeList child to a List, and if that succeeds (i.e. all offsets of the LargeList fit inside a List), flatten it into the parent List; otherwise, return an error. Alternatively, we could upcast the parent List to a LargeList, though this might be tricky since return_type() wouldn't know this until execution, and I don't think we want to blindly upcast all parent Lists to LargeList.
Open to any other suggestions.
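To make the "best effort" idea above a bit more concrete, here is a minimal sketch on plain offset slices rather than real Arrow arrays. The function names try_narrow_offsets and flatten_offsets are hypothetical and are not part of DataFusion or arrow-rs; the sketch only assumes that a List uses i32 offsets and a LargeList uses i64 offsets, and that flattening one level means picking the child offsets at the positions given by the parent offsets.

```rust
/// Hypothetical helper: try to narrow LargeList-style i64 offsets to
/// List-style i32 offsets. Returns None if any offset does not fit in i32.
fn try_narrow_offsets(offsets: &[i64]) -> Option<Vec<i32>> {
    offsets.iter().map(|&o| i32::try_from(o).ok()).collect()
}

/// Hypothetical helper: compute the offsets of the flattened list.
/// `parent` holds the outer List offsets (indices into the child's rows),
/// `child` holds the inner LargeList offsets (indices into the values).
/// Returns None when the child offsets cannot be represented as i32,
/// or when a parent offset is negative or out of range for the child.
fn flatten_offsets(parent: &[i32], child: &[i64]) -> Option<Vec<i32>> {
    // Best effort: only proceed if the LargeList offsets fit in a List.
    let child = try_narrow_offsets(child)?;
    parent
        .iter()
        .map(|&p| child.get(usize::try_from(p).ok()?).copied())
        .collect()
}
```

For the nested value [[[1], [2, 3]], [[null], [4, null, 5]]], the parent offsets are [0, 2, 4] and the child offsets are [0, 1, 3, 4, 7], so flatten_offsets returns Some([0, 3, 7]), i.e. two flattened rows of lengths 3 and 4. If any child offset exceeds i32::MAX it returns None, which is the point where flatten would report an execution error.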
Describe alternatives you've considered
No response
Additional context
No response