
BinaryType Column Not Supported when use_parquet_in_write Argument Set #602

@barino-carvana

Description

I am using PySpark with Spark 3.4.1 and version 3.1.1 of the Snowflake connector.

I am trying to write a DataFrame with a BinaryType column from Spark to Snowflake and keep running into the following error:

ERROR StageWriter$: Error occurred while loading files to Snowflake: java.sql.SQLException: Status of query associated with resultSet is FAILED_WITH_ERROR. SQL compilation error:
Expression type does not match column data type, expecting BINARY(8388608) but got VARIANT for column KEY_HASH Results not generated.

This happens with both the Append and Overwrite save modes, and with usestagingtable set to either true or false. I also tried setting the BINARY_INPUT_FORMAT = "BASE64" session parameter, but the error persists.
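For reference, here is a minimal sketch of how the write is issued. Credentials, warehouse, table name and the sample data are placeholders; the relevant parts are the BinaryType column and the use_parquet_in_write option.

from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, StringType, BinaryType

spark = SparkSession.builder.appName("binary-write-repro").getOrCreate()

# Connection options are placeholders.
sf_options = {
    "sfURL": "<account>.snowflakecomputing.com",
    "sfUser": "<user>",
    "sfPassword": "<password>",
    "sfDatabase": "AZURE_SQL_ADS_RAW",
    "sfSchema": "DBO",
    "sfWarehouse": "<warehouse>",
}

schema = StructType([
    StructField("MIGRATION_ID", StringType()),
    StructField("KEY_HASH", BinaryType()),  # the column that triggers the error
])
df = spark.createDataFrame([("1", bytearray(b"\x01\x02\x03"))], schema)

(df.write
    .format("snowflake")                    # i.e. net.snowflake.spark.snowflake
    .options(**sf_options)
    .option("dbtable", "_RAW_TBL_NAME_")
    .option("use_parquet_in_write", "true")
    .option("usestagingtable", "false")     # fails with true as well
    .mode("append")                         # fails with overwrite as well
    .save())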

I believe I see the issue: the COPY INTO queries the connector generates don't perform any TO_BINARY conversion:

copy into "AZURE_SQL_ADS_RAW"."DBO"."_RAW_TBL_NAME_staging_95236350"
( "MIGRATION_ID", "MIGRATION_JOB_RUN_ID", "DATA", "KEY_HASH", "LAST_MODIFIED_DATETIME", "METADATA" )
from (
    select $1:"_MIGRATION_ID_",
        $1:"_MIGRATION_JOB_RUN_ID_",
        $1:"_DATA_",
        $1:"_KEY_HASH_",
        $1:"_LAST_MODIFIED_DATETIME_",
        $1:"_METADATA_"
     FROM @spark_connector_load_stage_buhsX4u9oa/m6q9yLC6Do/ tmp
) 
FILE_FORMAT = (
    TYPE=PARQUET
    USE_VECTORIZED_SCANNER=TRUE
  )

In the above, KEY_HASH is a BinaryType field.
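I would expect the generated query to convert that column explicitly, along the lines of the sketch below. TO_BINARY does accept a VARIANT argument, but whether this exact expression is the right fix inside the connector is an assumption on my part.

copy into "AZURE_SQL_ADS_RAW"."DBO"."_RAW_TBL_NAME_staging_95236350"
( "MIGRATION_ID", "MIGRATION_JOB_RUN_ID", "DATA", "KEY_HASH", "LAST_MODIFIED_DATETIME", "METADATA" )
from (
    select $1:"_MIGRATION_ID_",
        $1:"_MIGRATION_JOB_RUN_ID_",
        $1:"_DATA_",
        to_binary($1:"_KEY_HASH_"),   -- explicit conversion instead of the bare VARIANT
        $1:"_LAST_MODIFIED_DATETIME_",
        $1:"_METADATA_"
     FROM @spark_connector_load_stage_buhsX4u9oa/m6q9yLC6Do/ tmp
)
FILE_FORMAT = (
    TYPE=PARQUET
    USE_VECTORIZED_SCANNER=TRUE
  )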
