[SPARK-49866][SQL] Improve the error message for describe table with partition columns

### What changes were proposed in this pull request?
Provide a more user-facing error when a partition column name can't be found in the table schema.

### Why are the changes needed?
Sometimes a partition column does not match any column in the table schema. When that happens, we throw an assertion error, which is not user friendly. This PR therefore introduces a new error in `QueryExecutionErrors` to make the failure more user facing.
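
To illustrate the difference, here is a minimal, Spark-free sketch. The object `PartitionLookupSketch`, the class `PartitionColumnNotFound`, and the flat name-to-type schema are all hypothetical stand-ins invented for this example; they are not Spark's `SparkRuntimeException` or `StructType`. It only shows the general pattern of replacing an `assert` with a dedicated exception that carries an error condition name.

```scala
object PartitionLookupSketch {
  // Hypothetical stand-in for a structured, user-facing error (not a Spark class).
  final class PartitionColumnNotFound(val column: String, val schema: String)
    extends RuntimeException(
      s"[PARTITION_COLUMN_NOT_FOUND_IN_SCHEMA] Partition column $column " +
        s"not found in schema $schema.")

  // Simplified flat schema: column name -> type name (an assumption for this sketch).
  val schema: Seq[(String, String)] = Seq("id" -> "bigint", "day" -> "date")

  def schemaString: String =
    schema.map { case (name, tpe) => s"$name $tpe" }.mkString(", ")

  // Old style: a failed assert raises java.lang.AssertionError,
  // which signals an internal bug rather than a user mistake.
  def lookupWithAssert(name: String): (String, String) = {
    val field = schema.find(_._1 == name)
    assert(field.isDefined,
      s"Not found the partition column $name in the table schema $schemaString.")
    field.get
  }

  // New style: a missing column raises a dedicated exception
  // that callers can catch and report by its error condition.
  def lookupWithError(name: String): (String, String) =
    schema.find(_._1 == name).getOrElse(
      throw new PartitionColumnNotFound(name, schemaString))
}
```

In this sketch, `lookupWithError("missing")` fails with `PartitionColumnNotFound`, while `lookupWithAssert("missing")` fails with an `AssertionError`. Scala's `assert` can also be elided at compile time, in which case the old style would fall through to `field.get` and fail with a less informative exception.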

### Does this PR introduce _any_ user-facing change?
Yes, users will get a more user-friendly error message.

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #48338 from mihailoale-db/mihailoale-db/fixdescribepartitioningmessage.

Authored-by: Mihailo Aleksic <[email protected]>
Signed-off-by: Max Gekk <[email protected]>
mihailoale-db authored and MaxGekk committed Oct 5, 2024
1 parent 3e69b40 commit 37f2966
Showing 3 changed files with 25 additions and 3 deletions.
6 changes: 6 additions & 0 deletions common/utils/src/main/resources/error/error-conditions.json
@@ -3802,6 +3802,12 @@
     ],
     "sqlState" : "428FT"
   },
+  "PARTITION_COLUMN_NOT_FOUND_IN_SCHEMA" : {
+    "message" : [
+      "Partition column <column> not found in schema <schema>. Please provide the existing column for partitioning."
+    ],
+    "sqlState" : "42000"
+  },
   "PATH_ALREADY_EXISTS" : {
     "message" : [
       "Path <outputPath> already exists. Set mode as \"overwrite\" to overwrite the existing path."
@@ -2856,4 +2856,16 @@ private[sql] object QueryExecutionErrors extends QueryErrorsBase with ExecutionE
       )
     )
   }
+
+  def partitionColumnNotFoundInTheTableSchemaError(
+      column: Seq[String],
+      schema: StructType): SparkRuntimeException = {
+    new SparkRuntimeException(
+      errorClass = "PARTITION_COLUMN_NOT_FOUND_IN_SCHEMA",
+      messageParameters = Map(
+        "column" -> toSQLId(column),
+        "schema" -> toSQLType(schema)
+      )
+    )
+  }
 }
@@ -27,6 +27,7 @@ import org.apache.spark.sql.catalyst.util.{quoteIfNeeded, ResolveDefaultColumns}
 import org.apache.spark.sql.connector.catalog.{CatalogV2Util, SupportsMetadataColumns, SupportsRead, Table, TableCatalog}
 import org.apache.spark.sql.connector.expressions.{ClusterByTransform, IdentityTransform}
 import org.apache.spark.sql.connector.read.SupportsReportStatistics
+import org.apache.spark.sql.errors.QueryExecutionErrors
 import org.apache.spark.sql.util.CaseInsensitiveStringMap
 import org.apache.spark.util.ArrayImplicits._

@@ -156,9 +157,12 @@ case class DescribeTableExec(
         .map(_.asInstanceOf[IdentityTransform].ref.fieldNames())
         .map { fieldNames =>
           val nestedField = table.schema.findNestedField(fieldNames.toImmutableArraySeq)
-          assert(nestedField.isDefined,
-            s"Not found the partition column ${fieldNames.map(quoteIfNeeded).mkString(".")} " +
-              s"in the table schema ${table.schema().catalogString}.")
+          if (nestedField.isEmpty) {
+            throw QueryExecutionErrors.partitionColumnNotFoundInTheTableSchemaError(
+              fieldNames.toSeq,
+              table.schema()
+            )
+          }
           nestedField.get
         }.map { case (path, field) =>
           toCatalystRow(
