fix: Prevent join panic when suffix="" and coalesce=True #27376

Open
Kevin-Patyk wants to merge 6 commits into pola-rs:main from Kevin-Patyk:fix/join_panic

Kevin-Patyk commented Apr 21, 2026

Should resolve #27368.

The issue was that _coalesce_full_join called .schema() on df. In the scenario outlined in the issue, suffix is an empty string, so _finish_join produces an output DataFrame with two identically named columns. That frame is then passed to _coalesce_full_join, which calls .schema() on it and panics on the duplicate column name.

The fix (I believe) is to match keys to their positions in df by index rather than relying on the schema.
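
To illustrate the idea outside of Polars, here is a minimal pure-Python sketch. It is not the Rust code in this PR: `build_schema` and `coalesce_by_position` are hypothetical helpers, and the column layout (left columns followed by right columns) is an assumption made for the example. It shows why a name-keyed lookup breaks on duplicate names while a position-based coalesce does not.

```python
def build_schema(names):
    """Name-keyed lookup (hypothetical): fails when a name occurs twice."""
    schema = {}
    for idx, name in enumerate(names):
        if name in schema:
            raise ValueError(
                f"column with name {name!r} has more than one occurrence"
            )
        schema[name] = idx
    return schema


def coalesce_by_position(names, columns, left_key_idx, right_key_idx):
    """Merge the right key column into the left one using positions only."""
    left, right = columns[left_key_idx], columns[right_key_idx]
    # Take the left value when present, otherwise fall back to the right value.
    merged = [l if l is not None else r for l, r in zip(left, right)]
    out_names, out_columns = [], []
    for i, (name, col) in enumerate(zip(names, columns)):
        if i == right_key_idx:
            continue  # the right key column is dropped after coalescing
        out_names.append(name)
        out_columns.append(merged if i == left_key_idx else col)
    return out_names, out_columns


# Assumed intermediate frame for the example in the issue: a full join of
# (a, b) with (a, c) on "a" and suffix="" yields two columns named "a".
names = ["a", "b", "a", "c"]
columns = [
    [1, None, 0],    # left "a"
    [11, None, 10],  # "b"
    [1, 2, None],    # right "a" (same name because suffix is empty)
    [11, 12, None],  # "c"
]

try:
    build_schema(names)
    schema_error = None
except ValueError as exc:
    schema_error = str(exc)

result_names, result_columns = coalesce_by_position(names, columns, 0, 2)
```

Here the name-based lookup raises on the second "a", while the position-based version never needs the names to be unique and ends up with the columns a, b, c.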

🤖 Claude Sonnet 4.6 for navigating existing code, rubber ducking, and helping with syntax.

Open to suggestions if there are any for this particular fix. Existing Python tests and the added regression test pass, but maybe there is something I am missing. Thanks!

github-actions bot added the fix, python, and rust labels Apr 21, 2026
Kevin-Patyk marked this pull request as ready for review April 21, 2026 14:48
Kevin-Patyk marked this pull request as draft April 21, 2026 16:02

Kevin-Patyk commented Apr 21, 2026

@kdn36 @nameexhaustion

The current issue is that the test I wrote fails in debug mode (make test), because we now temporarily allow duplicate names through before coalescing. hstack_mut_unchecked in _finish_join raises an error when there are duplicate column names, which now exist because the provided suffix is "":

The application panicked (crashed).
Message:  called `Result::unwrap()` on an `Err` value: Duplicate(ErrString("column with name 'a' has more than one occurrence"))
Location: crates/polars-core/src/frame/horizontal.rs:33

The error happens in crates/polars-core/src/frame/horizontal.rs, which is where hstack_mut_unchecked lives.
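
One way to picture the debug/release difference is the sketch below. It assumes the "unchecked" function validates column names only when debug checks are enabled, which the behaviour in this thread suggests but does not confirm. `hstack_unchecked` and its `debug` parameter are hypothetical stand-ins, not Polars APIs.

```python
def hstack_unchecked(names, new_names, debug):
    """Append column names; validate uniqueness only when debug is True.

    Hypothetical model of an "unchecked" operation that still carries a
    debug-only sanity check.
    """
    combined = names + new_names
    if debug:
        seen = set()
        for name in combined:
            if name in seen:
                raise ValueError(
                    f"column with name {name!r} has more than one occurrence"
                )
            seen.add(name)
    return combined


# "Release" behaviour: duplicates pass through and can be coalesced later.
release_result = hstack_unchecked(["a", "b"], ["a", "c"], debug=False)

# "Debug" behaviour: the same call is rejected before coalescing can run.
try:
    hstack_unchecked(["a", "b"], ["a", "c"], debug=True)
    debug_error = None
except ValueError as exc:
    debug_error = str(exc)
```

Under that assumption, any fix that lets duplicate names exist temporarily would need to either avoid the checked path or rename the right key before stacking.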

If I run in release mode with my fix (using make build-release) it works fine:

import polars as pl
print(pl.__version__)
df1 = pl.DataFrame({'a': [0, 1], 'b': [10, 11]})
df2 = pl.DataFrame({'a': [1, 2], 'c': [11, 12]})
result = df1.join(df2, how='full', on='a', coalesce=True, suffix='')
print(result)

1.40.0
shape: (3, 3)
┌─────┬──────┬──────┐
│ a   ┆ b    ┆ c    │
│ --- ┆ ---  ┆ ---  │
│ i64 ┆ i64  ┆ i64  │
╞═════╪══════╪══════╡
│ 1   ┆ 11   ┆ 11   │
│ 2   ┆ null ┆ 12   │
│ 0   ┆ 10   ┆ null │
└─────┴──────┴──────┘

I wasn't sure how to go about dealing with it since it could affect some underlying logic. Thanks for your help!

Kevin-Patyk marked this pull request as ready for review April 21, 2026 16:56

Development

Successfully merging this pull request may close these issues.

Full Join Panic: coalesce=True, suffix=""