@kosiew kosiew commented Oct 2, 2025

Which issue does this PR close?


Rationale for this change

This PR extends DataFusion’s physical expression infrastructure to better handle column casts during projection equivalence, ordering propagation, and interval analysis.

Previously, only Column and CastExpr were normalized or considered in equivalence groups. As a result, a CastColumnExpr whose input field diverged from the projection's input schema could produce mismatches. Teaching these utilities to normalize and propagate CastColumnExpr ensures:

  • Input field consistency in ProjectionMapping
  • Correct ordering equivalence detection when casts are widening
  • Interval utilities that properly recurse through CastColumnExpr nodes

These changes improve correctness in downstream optimizations without altering external APIs.
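
To illustrate why only widening casts are treated as order-preserving, here is a minimal standalone sketch. It does not use DataFusion or Arrow; the helper names (`is_sorted_asc`, `widen`, `narrow_wrapping`) are hypothetical, and the narrowing case uses Rust's wrapping `as` conversion, which is an assumption chosen for illustration and not necessarily how Arrow's cast kernels treat overflow.

```rust
/// Returns true if `values` is in non-decreasing order.
fn is_sorted_asc<T: PartialOrd>(values: &[T]) -> bool {
    values.windows(2).all(|w| w[0] <= w[1])
}

/// Widening cast: every i32 is exactly representable as an i64,
/// so the relative order of any two values is unchanged.
fn widen(values: &[i32]) -> Vec<i64> {
    values.iter().map(|&v| i64::from(v)).collect()
}

/// Narrowing cast with wrap-around semantics (Rust's `as`).
/// Values outside the i32 range wrap, which can reorder the output.
fn narrow_wrapping(values: &[i64]) -> Vec<i32> {
    values.iter().map(|&v| v as i32).collect()
}
```

A sorted `i32` input stays sorted after `widen`, whereas a sorted `i64` input containing a value just above `i32::MAX` is no longer sorted after `narrow_wrapping`. That is the reason an ordering on the input column can be propagated through the cast only in the widening direction.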


What changes are included in this PR?

  • Projection Mapping (projection.rs)

    • Normalize CastColumnExpr input fields to align with the input_schema
    • Transform and replace mismatched CastColumnExpr with corrected ones
  • Equivalence Properties (properties/mod.rs)

    • Extend sort equivalence rules to include CastColumnExpr when widening casts preserve ordering
  • Intervals Utilities (intervals/utils.rs)

    • Add recursive support for CastColumnExpr in check_support
  • Cast Column Expression Enhancements (expressions/cast_column.rs)

    • Add cast_options accessor
    • Add is_widening_cast helper to classify casts that preserve ordering
  • Tests

    • New unit tests covering normalization of mismatched CastColumnExpr input fields
    • Extended ordering propagation test cases with CastColumnExpr
    • Interval utility tests confirming support for CastColumnExpr
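
As a rough picture of what a helper like is_widening_cast has to decide, the sketch below classifies casts over a toy type enum. Everything here is hypothetical: `ToyType`, `signed_int_bits`, and `toy_is_widening_cast` are illustration-only names, and the rule set (signed integer to a wider signed integer, or Float32 to Float64) is an assumption, not the set of casts the PR actually accepts.

```rust
/// Toy stand-in for a data type; not Arrow's `DataType`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ToyType {
    Int8,
    Int16,
    Int32,
    Int64,
    Float32,
    Float64,
    Utf8,
}

/// Bit width for signed integer types, `None` for everything else.
fn signed_int_bits(t: ToyType) -> Option<u8> {
    match t {
        ToyType::Int8 => Some(8),
        ToyType::Int16 => Some(16),
        ToyType::Int32 => Some(32),
        ToyType::Int64 => Some(64),
        _ => None,
    }
}

/// Conservative classification: returns true only when every source value
/// is exactly representable in the target type (assumed rule set).
fn toy_is_widening_cast(from: ToyType, to: ToyType) -> bool {
    if from == to {
        // An identity cast trivially preserves order.
        return true;
    }
    match (signed_int_bits(from), signed_int_bits(to)) {
        (Some(src), Some(dst)) => src < dst,
        _ => matches!((from, to), (ToyType::Float32, ToyType::Float64)),
    }
}
```

Being conservative is the safer choice in a sketch like this: wrongly rejecting a cast only loses an optimization, while wrongly accepting one could let a sort be elided incorrectly.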

Are these changes tested?

✅ Yes.

  • Projection mapping normalization is verified by projection_mapping_normalizes_cast_column_input_field.
  • Ordering equivalence propagation is covered by new test cases (test case 6).
  • Interval utility recursion is tested with supports_cast_column_expr.
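
The recursion that a test like supports_cast_column_expr exercises can be pictured with a toy expression tree. This is a simplified sketch under stated assumptions: `ToyExpr` and `toy_check_support` are hypothetical names, and the real check_support presumably also inspects data types and operators, which is omitted here.

```rust
/// Toy expression tree; not DataFusion's `PhysicalExpr`.
enum ToyExpr {
    Column(String),
    Literal(i64),
    /// Stand-in for a CastColumnExpr wrapping a child expression.
    CastColumn(Box<ToyExpr>),
    Binary(Box<ToyExpr>, Box<ToyExpr>),
    /// Any node kind the interval analysis does not understand.
    Opaque,
}

/// A cast node is supported exactly when its child is supported,
/// so the check recurses through the cast instead of rejecting it.
fn toy_check_support(expr: &ToyExpr) -> bool {
    match expr {
        ToyExpr::Column(_) | ToyExpr::Literal(_) => true,
        ToyExpr::CastColumn(inner) => toy_check_support(inner),
        ToyExpr::Binary(lhs, rhs) => toy_check_support(lhs) && toy_check_support(rhs),
        ToyExpr::Opaque => false,
    }
}
```

With this shape, a cast over a plain column is supported, while a cast wrapping an unsupported node causes the whole enclosing expression to be rejected.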

Are there any user-facing changes?

  • No breaking API changes.
  • Internal improvements only; external behavior remains consistent.
  • Projections involving CastColumnExpr are now handled by projection mapping, ordering equivalence, and interval analysis.

@github-actions github-actions bot added the physical-expr Changes to the physical-expr crates label Oct 2, 2025
@kosiew kosiew force-pushed the util-cast-column-expr-17761 branch from d90282d to 7c9d879 Compare October 2, 2025 10:37
@kosiew kosiew marked this pull request as ready for review October 2, 2025 12:08
Successfully merging this pull request may close these issues.

Teach physical-expression utilities about CastColumnExpr