Implement error handling by putting the info into _Spark_'s standard error column (String) #86

@benedeki

Description

Background

One of the provided ErrorHandling implementations. The title is actually a little misleading: the point is to write the errors into a string column, and the column name should default to the value of spark.sql.columnNameOfCorruptRecord (see Runtime SQL Configuration).

Feature

Write errors into a StringType column by converting each field of an error submit into a string and concatenating them with a delimiter. The column name should (or might) default to spark.sql.columnNameOfCorruptRecord.

Proposed Solution

Solution Ideas:

  • Configurable error delimiter (delimiter between different errors), default \n
  • Configurable error field delimiter, default ,
  • Enable quoting? Probably yes
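
The ideas above could be sketched roughly as follows. This is only an illustration in plain Python, not the library's implementation: the `ErrorSubmit` class, its field names, and the functions `quote_field`, `format_error` and `format_errors` are all hypothetical, and the quoting rule (wrap in double quotes, double any embedded quote) is an assumption borrowed from common CSV practice.

```python
from dataclasses import astuple, dataclass


@dataclass
class ErrorSubmit:
    """Hypothetical shape of a submitted error; the real fields may differ."""
    error_type: str
    error_code: int
    error_message: str
    error_column: str


def quote_field(text: str, field_delimiter: str, error_delimiter: str) -> str:
    """Quote a field only if it contains a delimiter or a double quote."""
    specials = (field_delimiter, error_delimiter, '"')
    if any(special in text for special in specials):
        # Assumed convention: embedded quotes are doubled, CSV-style.
        return '"' + text.replace('"', '""') + '"'
    return text


def format_error(error: ErrorSubmit, field_delimiter: str = ",",
                 error_delimiter: str = "\n", quoting: bool = True) -> str:
    """Convert every field of one error to a string and join the fields."""
    fields = [str(value) for value in astuple(error)]
    if quoting:
        fields = [quote_field(f, field_delimiter, error_delimiter) for f in fields]
    return field_delimiter.join(fields)


def format_errors(errors, error_delimiter: str = "\n",
                  field_delimiter: str = ",", quoting: bool = True) -> str:
    """Join all formatted errors into the single string stored in the column."""
    return error_delimiter.join(
        format_error(e, field_delimiter, error_delimiter, quoting) for e in errors
    )


errors = [
    ErrorSubmit("TypeError", 1, "Cannot cast, value too long", "amount"),
    ErrorSubmit("NullError", 2, "Value is null", "id"),
]
column_value = format_errors(errors)
```

With quoting enabled, the comma inside the first message stays unambiguous because that field is wrapped in quotes. With quoting disabled, the same message would split into two apparent fields, which is the main argument for the "Probably yes" above. The resulting string would then be written to the error column; Spark's documented default value for spark.sql.columnNameOfCorruptRecord is `_corrupt_record`.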

Metadata

Assignees

No one assigned

    Labels

    enhancement: New feature or request

    Type

    No type

    Projects

    Status

    📋 Backlog

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
