I am using PySpark with Spark 3.4.1 and version 3.1.1 of the Snowflake connector.
I am trying to write a DataFrame with a BinaryType column from Spark to Snowflake and keep running into the following error:
```
ERROR StageWriter$: Error occurred while loading files to Snowflake: java.sql.SQLException: Status of query associated with resultSet is FAILED_WITH_ERROR. SQL compilation error:
Expression type does not match column data type, expecting BINARY(8388608) but got VARIANT for column KEY_HASH Results not generated.
```
This happens whether I use Append or Overwrite mode, and with usestagingtable set to true or false. I also tried setting the BINARY_INPUT_FORMAT = 'BASE64' session parameter.
I believe I see the issue: the COPY INTO queries the connector generates don't perform any TO_BINARY conversion:
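As a possible client-side workaround (a sketch on my part, untested with this connector), the BinaryType column could be hex-encoded into a plain string before the write, so the value arrives in a form Snowflake's TO_BINARY with the default 'HEX' format accepts. In PySpark the same idea would be applying F.hex() to the column before df.write; below is the plain-Python equivalent of that encoding:

```python
# Sketch of a client-side workaround (assumption, not verified against the
# connector): hex-encode the binary payload so it can be written as a string
# column and converted server-side with TO_BINARY(col, 'HEX').
import binascii

def to_hex_string(payload: bytes) -> str:
    """Encode raw bytes as the uppercase hex string that Snowflake's
    TO_BINARY (default 'HEX' format) accepts."""
    return binascii.hexlify(payload).decode("ascii").upper()

print(to_hex_string(b"\x01\xab\xff"))  # → 01ABFF
```

This sidesteps the VARIANT-to-BINARY cast entirely, at the cost of changing the staging column to a string type.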
```sql
copy into "AZURE_SQL_ADS_RAW"."DBO"."_RAW_TBL_NAME_staging_95236350"
( "MIGRATION_ID", "MIGRATION_JOB_RUN_ID", "DATA", "KEY_HASH", "LAST_MODIFIED_DATETIME", "METADATA" )
from (
select $1:"_MIGRATION_ID_",
$1:"_MIGRATION_JOB_RUN_ID_",
$1:"_DATA_",
$1:"_KEY_HASH_",
$1:"_LAST_MODIFIED_DATETIME_",
$1:"_METADATA_"
FROM @spark_connector_load_stage_buhsX4u9oa/m6q9yLC6Do/ tmp
)
FILE_FORMAT = (
TYPE=PARQUET
USE_VECTORIZED_SCANNER=TRUE
)
```
In the above, KEY_HASH is a BinaryType field.
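For reference, a hedged sketch of the projection I would have expected the connector to emit, with an explicit cast on the binary column (the 'HEX' format argument is an assumption; the correct format depends on how the connector serializes BinaryType into the staged Parquet files):

```sql
-- Hypothetical corrected projection (not the connector's actual output):
-- wrap the binary column in TO_BINARY so the VARIANT value read from the
-- Parquet file is cast before it reaches the BINARY target column.
select $1:"_MIGRATION_ID_",
       $1:"_MIGRATION_JOB_RUN_ID_",
       $1:"_DATA_",
       to_binary($1:"_KEY_HASH_"::string, 'HEX'),
       $1:"_LAST_MODIFIED_DATETIME_",
       $1:"_METADATA_"
FROM @spark_connector_load_stage_buhsX4u9oa/m6q9yLC6Do/ tmp
```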