@benbellick
Member

@benbellick benbellick commented Nov 26, 2025

This is built on top of #613, so I'll only open it up for proper review once that is merged in.

Closes #612

@benbellick
Member Author

  • Substrait allows two encodings for UDT literals: Any (opaque bytes) and struct (typed fields). Calcite has no native UDT literal, so we need Calcite-side carriers that preserve both the payload and the UDT type.

  • Existing pattern for Any: we already used REINTERPRET over a binary literal to shuttle UserDefinedAnyLiteral payloads. We kept that because it’s minimal and reuses Calcite’s reinterpret machinery without new operators.

  • Struct-encoded UDTs: Calcite’s natural literal for structured payloads is ROW(...). To keep the UDT type, we wrap the ROW in a REINTERPRET cast to the UDT type, mirroring the Any path: payload literal + reinterpret to UDT.

  • Reverse path (Calcite → Substrait): peel the REINTERPRET; if the operand is binary, rebuild UserDefinedAnyLiteral; if it’s ROW, rebuild UserDefinedStructLiteral. A general ROW converter was added so that struct literals can be reconstructed from the REINTERPRET operand.

  • Why this shape: it gives a uniform pattern in which the outer type is always the UDT via reinterpret and the inner literal is the chosen encoding (binary or ROW). It keeps Calcite changes small, leverages existing operators, and aligns with Substrait’s dual-encoding spec.
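
To make the round trip above concrete, here is a minimal, self-contained sketch of the carrier pattern. It deliberately does not use the real Calcite or isthmus classes: `Expr`, `BinaryLit`, `RowLit`, `Reinterpret`, `AnyLiteral`, `StructLiteral`, and `UdtCarrier` are all hypothetical stand-ins invented for illustration, and the UDT type is reduced to a plain name string. It only shows the shape of the idea (payload literal wrapped in a reinterpret to the UDT, then peeled on the way back), not the code in this PR.

```java
import java.util.List;

// Hypothetical stand-ins for Calcite expression nodes (NOT the real RexNode API).
interface Expr {}
record BinaryLit(byte[] bytes) implements Expr {}                    // opaque payload
record RowLit(List<Object> fields) implements Expr {}                // ROW(...) payload
record Reinterpret(String udtName, Expr operand) implements Expr {}  // outer UDT type

// Hypothetical stand-ins for the two Substrait UDT literal encodings.
interface UdtLiteral { String udtName(); }
record AnyLiteral(String udtName, byte[] payload) implements UdtLiteral {}
record StructLiteral(String udtName, List<Object> fields) implements UdtLiteral {}

final class UdtCarrier {
  private UdtCarrier() {}

  // Substrait -> Calcite: pick the inner literal by encoding, then wrap it
  // in a reinterpret so the outer type is always the UDT.
  static Expr toCalcite(UdtLiteral lit) {
    if (lit instanceof AnyLiteral any) {
      return new Reinterpret(any.udtName(), new BinaryLit(any.payload()));
    }
    if (lit instanceof StructLiteral struct) {
      return new Reinterpret(struct.udtName(), new RowLit(struct.fields()));
    }
    throw new IllegalArgumentException("unsupported UDT literal: " + lit);
  }

  // Calcite -> Substrait: peel the reinterpret and dispatch on the operand kind.
  static UdtLiteral toSubstrait(Expr expr) {
    if (!(expr instanceof Reinterpret r)) {
      throw new IllegalArgumentException("expected a reinterpret, got: " + expr);
    }
    if (r.operand() instanceof BinaryLit bin) {
      return new AnyLiteral(r.udtName(), bin.bytes());
    }
    if (r.operand() instanceof RowLit row) {
      return new StructLiteral(r.udtName(), row.fields());
    }
    throw new IllegalArgumentException("unsupported operand: " + r.operand());
  }
}
```

In this sketch, `UdtCarrier.toSubstrait(UdtCarrier.toCalcite(lit))` returns a literal with the same encoding, UDT name, and payload as `lit`, which is the property the uniform "reinterpret over payload" shape is meant to preserve.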

Development

Successfully merging this pull request may close these issues.

Handle struct-based UDT literals in isthmus

2 participants