[SPARK-54776][SQL] Improved the logs message regarding lambda function with SQL UDF #53542
+68
−1
What changes were proposed in this pull request?
Changes made:
Added a new error condition in error-conditions.json:
UNSUPPORTED_FEATURE.LAMBDA_FUNCTION_WITH_SQL_UDF - A clear error message raised when a SQL UDF is used inside a lambda function.
Why are the changes needed?
Currently, when a SQL UDF is used inside a higher-order function like transform, the error message is confusing:
Before (confusing error):
[MISSING_ATTRIBUTES.RESOLVED_ATTRIBUTE_MISSING_FROM_INPUT]
Resolved attribute(s) "x" missing from in operator !Project [cast(lambda x#20395 as string) AS s#20397].
SQLSTATE: XX000
This error doesn't explain why the attribute is missing or what the user should do.
After (clear error):
[UNSUPPORTED_FEATURE.LAMBDA_FUNCTION_WITH_SQL_UDF] The feature is not supported: Lambda function with SQL UDF "spark_catalog.default.lower_udf(lambda x)" in a higher order function. SQLSTATE: 0A000
This is consistent with the existing error message for Python UDFs in the same scenario (UNSUPPORTED_FEATURE.LAMBDA_FUNCTION_WITH_PYTHON_UDF).
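To illustrate how the "After" message is assembled from an error condition and its parameter, here is a minimal Python sketch. It is not Spark's implementation: the `ERROR_CONDITIONS` dictionary is a simplified, hypothetical mirror of an error-conditions.json entry, the template text is inferred from the message shown above rather than copied from the file, and `render_error` is an illustrative helper.

```python
import re

# Hypothetical, simplified mirror of an error-conditions.json entry.
# The template wording is inferred from the "After" message in this PR,
# not copied from the actual file.
ERROR_CONDITIONS = {
    "UNSUPPORTED_FEATURE": {
        "message": "The feature is not supported:",
        "sqlState": "0A000",
        "subClass": {
            "LAMBDA_FUNCTION_WITH_SQL_UDF": {
                "message": "Lambda function with SQL UDF <funcName> in a higher order function."
            }
        },
    }
}


def render_error(condition: str, params: dict) -> str:
    """Illustrative helper: fill <param> placeholders and add the SQLSTATE."""
    main, sub = condition.split(".", 1)
    entry = ERROR_CONDITIONS[main]
    template = f'{entry["message"]} {entry["subClass"][sub]["message"]}'
    # Replace each <name> placeholder with the matching parameter value.
    body = re.sub(r"<(\w+)>", lambda m: params[m.group(1)], template)
    return f'[{condition}] {body} SQLSTATE: {entry["sqlState"]}'


message = render_error(
    "UNSUPPORTED_FEATURE.LAMBDA_FUNCTION_WITH_SQL_UDF",
    {"funcName": '"spark_catalog.default.lower_udf(lambda x)"'},
)
print(message)
```

Under these assumptions, the rendered string matches the "After" message quoted above, with the quoted function name substituted for `<funcName>`.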
Does this PR introduce any user-facing change?
Yes. Users will now see a clearer, more actionable error message when attempting to use a SQL UDF inside a higher-order function's lambda expression.
How was this patch tested?
Test 1:
Added a new test case "SQL UDF in higher-order function should fail with clear error message" in SQLFunctionSuite.scala that:
Creates a SQL UDF
Attempts to use it in a transform higher-order function
Verifies the error condition is UNSUPPORTED_FEATURE.LAMBDA_FUNCTION_WITH_SQL_UDF
Verifies the error message contains the function name and lambda x
Test 2:
Manual testing
spark.sql("CREATE OR REPLACE FUNCTION test_lower_udf(s STRING) RETURNS STRING RETURN lower(s)")
spark.sql("SELECT transform(array('A', 'B'), x -> test_lower_udf(x))").show()
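The manual check can also be wrapped so that it asserts on the error condition instead of relying on reading the output. The sketch below is a PySpark-style illustration, not part of the patch: `check_sql_udf_in_lambda_rejected` is a hypothetical helper, it takes an existing `spark` session as a parameter, and it assumes the raised exception exposes `getCondition()` (newer PySpark) or `getErrorClass()` (older PySpark).

```python
EXPECTED_CONDITION = "UNSUPPORTED_FEATURE.LAMBDA_FUNCTION_WITH_SQL_UDF"


def check_sql_udf_in_lambda_rejected(spark):
    """Hypothetical helper: run the manual test and return the error condition."""
    spark.sql(
        "CREATE OR REPLACE FUNCTION test_lower_udf(s STRING) "
        "RETURNS STRING RETURN lower(s)"
    )
    try:
        spark.sql(
            "SELECT transform(array('A', 'B'), x -> test_lower_udf(x))"
        ).show()
    except Exception as e:
        # Assumption: the exception exposes getCondition() or getErrorClass().
        getter = getattr(e, "getCondition", None) or getattr(e, "getErrorClass", None)
        condition = getter() if getter is not None else None
        assert condition == EXPECTED_CONDITION, f"unexpected condition: {condition}"
        # The message should name the function and the lambda variable.
        assert "test_lower_udf" in str(e)
        assert "lambda x" in str(e)
        return condition
    raise AssertionError("expected the query to fail, but it succeeded")
```

With a live session this would be called as `check_sql_udf_in_lambda_rejected(spark)`. It returns the condition name when the query is rejected with the new error and raises `AssertionError` otherwise.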
Was this patch authored or co-authored using generative AI tooling?
No