Closed
Describe the bug
The test python/cudf_polars/tests/test_join.py::test_non_coalesce_join[left-nulls_not_equal-join_expr0]
fails with the streaming executor when a small blocksize splits the input into multiple partitions.
Steps/Code to reproduce bug
Here's a simplified example, using an inner join:
import polars as pl
from cudf_polars.testing.asserts import assert_gpu_result_equal

left = pl.LazyFrame(
    {
        "a": [1, 2, 3, 1, None],
        "b": [1, 2, 3, 4, 5],
        "c": [2, 3, 4, 5, 6],
    }
)
right = pl.LazyFrame(
    {
        "a": [1, 4, 3, 7, None, None, 1],
        "c": [2, 3, 4, 5, 6, 7, 8],
        "d": [6, None, 7, 8, -1, 2, 4],
    }
)
q = left.join(right, on=pl.col("a"), how="inner", nulls_equal=False, coalesce=False)
assert_gpu_result_equal(
    q,
    engine=pl.GPUEngine(
        executor="streaming",
        executor_options={"max_rows_per_partition": 3},
    ),
)
which fails with
AssertionError: DataFrames are different (value mismatch for column 'a')
[left]: [1, 1, 3, 1, 1]
[right]: [1, 3, 1, 1, 1]
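As a quick sanity check on the reported values, the sketch below recomputes column 'a' of the inner join in plain Python (no polars or GPU needed) and compares it with the two lists from the error message. The helper `inner_join_keys` is hypothetical and written only for this illustration. It checks only the one column shown in the message, so it can suggest, but not prove, that the two frames differ in row order rather than in content.

```python
from collections import Counter

# Key columns copied from the reproducer above.
left_a = [1, 2, 3, 1, None]
right_a = [1, 4, 3, 7, None, None, 1]


def inner_join_keys(left_keys, right_keys):
    """Hypothetical reference join on a single key column.

    Emits one key per matching (left, right) pair, walking the left side
    in order. None never matches anything, mirroring nulls_equal=False.
    """
    out = []
    for lk in left_keys:
        if lk is None:
            continue
        for rk in right_keys:
            if rk is not None and lk == rk:
                out.append(lk)
    return out


reference = inner_join_keys(left_a, right_a)

# Values of column 'a' as printed in the AssertionError.
reported_left = [1, 1, 3, 1, 1]
reported_right = [1, 3, 1, 1, 1]

# Same multiset of values on both sides, and both agree with the reference.
same_rows = Counter(reported_left) == Counter(reported_right) == Counter(reference)
# The element order differs, which is what the assertion trips on.
same_order = reported_left == reported_right

print(reference, same_rows, same_order)
```

For this column, both reported lists contain the same values as the reference join and differ only in their order.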
Expected behavior
The GPU result should match the polars result, so the assertion passes without error.