Commit
[SPARK-49352][SQL] Avoid redundant array transform for identical expression

### What changes were proposed in this pull request?

This patch avoids `ArrayTransform` in the `resolveArrayType` function if the resolution expression is the same as the input parameter.

### Why are the changes needed?

Our customer encountered a significant performance regression when migrating from Spark 3.2 to Spark 3.4 on an `Insert Into` query that is analyzed as an `AppendData` on an Iceberg table. We found that the root cause is that, in Spark 3.4, `TableOutputResolver` resolves the query with an additional `ArrayTransform` on an `ArrayType` field. The `ArrayTransform`'s lambda function is actually an identity function, i.e., the transformation is redundant.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Unit test and manual e2e test

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #47843 from viirya/fix_redundant_array_transform.

Authored-by: Liang-Chi Hsieh <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
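For readers unfamiliar with the change, the idea described under "What changes were proposed" can be sketched with a toy expression tree. This is only an illustrative model in Python, not the Scala code in `TableOutputResolver`: the classes `Column`, `LambdaVar`, `Cast`, and the helpers `resolve_element` and `resolve_array` are hypothetical stand-ins, and only the name `ArrayTransform` is borrowed from the commit message.

```python
from dataclasses import dataclass


# Toy expression nodes. These are hypothetical stand-ins for illustration,
# not Spark's Catalyst classes.
@dataclass(frozen=True)
class Expr:
    pass


@dataclass(frozen=True)
class Column(Expr):
    name: str


@dataclass(frozen=True)
class LambdaVar(Expr):
    name: str


@dataclass(frozen=True)
class Cast(Expr):
    child: Expr
    to_type: str


@dataclass(frozen=True)
class ArrayTransform(Expr):
    array: Expr
    var: LambdaVar
    body: Expr


def resolve_element(elem: Expr, from_type: str, to_type: str) -> Expr:
    # Assumption: an element needs a cast only when the types differ.
    return elem if from_type == to_type else Cast(elem, to_type)


def resolve_array(array: Expr, from_elem_type: str, to_elem_type: str) -> Expr:
    var = LambdaVar("element")
    body = resolve_element(var, from_elem_type, to_elem_type)
    # If resolving the element returned the lambda variable unchanged, the
    # lambda would be an identity function, so skip the per-element transform.
    if body == var:
        return array
    return ArrayTransform(array, var, body)
```

In this model, `resolve_array(Column("tags"), "string", "string")` returns the column itself, while mismatched element types still produce an `ArrayTransform` wrapping a `Cast`.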