Commit
Optimised Schema Change for Cloud Datalake Sinks (#200)
This PR removes the properties that govern schema change rollover, as they can lead to errors and do not make sense in the current context. Additionally, the schema change detection mechanism has been updated to better handle compatibility.

**Changes**

### Removed Properties for Schema Change Rollover:

The `$connectorPrefix.schema.change.rollover` properties have been removed for the following connectors: `connect.s3`, `connect.datalake`, and `connect.gcpstorage`. This change eliminates potential errors and simplifies the schema change handling logic.

### New Property for Schema Change Detection:

The `$connectorPrefix.schema.change.detector` property has been introduced for the following connectors: `connect.s3`, `connect.datalake`, and `connect.gcpstorage`. This property allows users to configure the schema change detection behavior with the following possible values:

* `default`: Schemas are compared using object equality.
* `version`: Schemas are compared by their version field.
* `compatibility`: A more advanced mode that ensures schemas are compatible using Avro compatibility features.

### Updated Schema Change Detection Mechanism:

The `SchemaChangeDetector` trait and its implementations have been introduced to improve schema compatibility handling.

* The `VersionSchemaChangeDetector` uses a version comparison instead of the previous method, which used only direct class equality comparison on the schema class.
* The `DefaultSchemaChangeDetector` uses object equality for schema comparison, providing a straightforward approach to detecting changes.
* The `CompatibilitySchemaChangeDetector` leverages Avro compatibility features to ensure schemas remain compatible, offering a more robust mechanism for managing schema changes.
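To illustrate how the three detector modes differ, here is a minimal, library-free sketch in Python. It is not the connector's code: the `Schema` record, the function names, and `detector_for` are hypothetical stand-ins, and the `compatibility` branch uses a deliberately crude rule (every old field must survive with the same type) in place of the real Avro compatibility check. It also assumes that "schema change" in compatibility mode means "the new schema is not compatible with the old one".

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass(frozen=True)
class Schema:
    """Hypothetical stand-in for a record schema (not the Kafka Connect class)."""
    name: str
    version: Optional[int]
    fields: Tuple[Tuple[str, str], ...]  # (field name, field type) pairs


def default_detector(old: Schema, new: Schema) -> bool:
    # `default`: any difference under object equality counts as a change.
    return old != new


def version_detector(old: Schema, new: Schema) -> bool:
    # `version`: only the version field is compared.
    return old.version != new.version


def compatibility_detector(old: Schema, new: Schema) -> bool:
    # `compatibility`: simplified assumption standing in for Avro's checks.
    # The new schema is treated as compatible when it keeps every old field
    # with an unchanged type; added fields are tolerated.
    new_fields = dict(new.fields)
    compatible = all(new_fields.get(n) == t for n, t in old.fields)
    return not compatible


Detector = Callable[[Schema, Schema], bool]

# Keys mirror the documented values of `$connectorPrefix.schema.change.detector`.
DETECTORS: Dict[str, Detector] = {
    "default": default_detector,
    "version": version_detector,
    "compatibility": compatibility_detector,
}


def detector_for(value: str) -> Detector:
    """Pick a detector from the property value (hypothetical helper)."""
    try:
        return DETECTORS[value]
    except KeyError:
        raise ValueError(f"unknown schema change detector: {value!r}")


v1 = Schema("orders", 1, (("id", "long"),))
v2 = Schema("orders", 2, (("id", "long"), ("note", "string")))

print(detector_for("default")(v1, v2))        # True: the objects differ
print(detector_for("version")(v1, v2))        # True: version 1 vs 2
print(detector_for("compatibility")(v1, v2))  # False: only a field was added
```

Under these assumptions, adding a field with a version bump counts as a change in `default` and `version` mode, but not in `compatibility` mode.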
Commits:

* Avro Parquet Schema Rollover / No Rollover
* Initial parquet rollover changes
* Removing the schema rollover setting from configuration, amending how schema change is calculated for writers
* Test fixes
* Revert changes to CloudSinkTask
* Compatibility schema change detector
* Default schema change detector
* Add schema change detector connector property
* Fixes from review
1 parent 54d1db5, commit 4d42d05

Showing 30 changed files with 747 additions and 232 deletions.