CREATE OR REPLACE TABLE on Iceberg never succeeds if metadata location was changed concurrently #23477

Open
ivanvy-wix opened this issue Sep 18, 2024 · 0 comments
Labels
iceberg Iceberg connector

Comments

@ivanvy-wix

I’m not sure how to write a test with concurrent queries, but to reproduce manually:

  1. Create a table
    CREATE TABLE t AS
    SELECT 1 AS v
  2. Start a CREATE OR REPLACE TABLE query that takes some time to finish. I used a large-ish source table s that takes my test instance a while to scan:
    CREATE OR REPLACE TABLE t AS
    SELECT count() FROM s WHERE some_column LIKE '%aa%'
  3. While it’s running, update table t. This statement must start after the previous query starts and finish before it completes:
    INSERT INTO t VALUES (2)
  4. Observe CREATE OR REPLACE TABLE fail with
    Metadata location [s3a://iceberg-bucket/schema.db/t/metadata/00002-3919e831-9f5b-43f9-b73c-4e21cd320b0d.metadata.json] is not same as table metadata location [s3a://iceberg-bucket/schema.db/t/metadata/00001-c651ded9-73e9-47da-986f-4237e8de34a9.metadata.json] for schema.t
    

I think this happens because, on every commit attempt in BaseTransaction#commitReplaceTransaction(), Iceberg refreshes the table’s metadata and then, assuming the refresh picked up the latest state, sets base to current() before calling commit(base, current). But in AbstractIcebergTableOperations#refresh(), Trino doesn’t reload metadata when the table is being replaced. As a result, every commit retry hits the same CommitFailedException, which eventually exhausts the maximum number of retries and fails the query.
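
To illustrate the suspected mechanism, here is a toy model of the retry loop. It is not Trino or Iceberg code: the `Catalog` and `TableOps` classes, the `reload_on_refresh` flag, the shortened metadata file names, and the retry count of 4 are all made up for the sketch. It only shows that a refresh which never reloads makes every retry fail the same way.

```python
class CommitFailed(Exception):
    """Stand-in for Iceberg's CommitFailedException."""


class Catalog:
    """Hypothetical metastore: holds the table's current metadata location."""

    def __init__(self, location):
        self.location = location

    def swap(self, expected, new):
        # Compare-and-swap: reject the commit if someone else moved the pointer.
        if self.location != expected:
            raise CommitFailed(
                f"Metadata location [{self.location}] is not same as "
                f"table metadata location [{expected}]"
            )
        self.location = new


class TableOps:
    """Hypothetical table operations; reload_on_refresh=False models the suspected bug."""

    def __init__(self, catalog, reload_on_refresh):
        self.catalog = catalog
        self.reload_on_refresh = reload_on_refresh
        self.current = catalog.location  # state seen when the query started

    def refresh(self):
        if self.reload_on_refresh:
            self.current = self.catalog.location
        return self.current  # stale forever when reloading is skipped

    def commit(self, base, new):
        self.catalog.swap(base, new)
        self.current = new


def commit_replace(ops, new_location, max_retries=4):
    """Retry loop: refresh, take the result as base, try to commit."""
    attempts = 0
    while True:
        attempts += 1
        base = ops.refresh()
        try:
            ops.commit(base, new_location)
            return attempts
        except CommitFailed:
            if attempts > max_retries:
                raise


def run(reload_on_refresh):
    catalog = Catalog("00001.metadata.json")
    ops = TableOps(catalog, reload_on_refresh)  # CREATE OR REPLACE starts
    catalog.location = "00002.metadata.json"    # concurrent INSERT commits
    try:
        return ("committed", commit_replace(ops, "00003.metadata.json"))
    except CommitFailed:
        return ("failed", None)
```

In this model, `run(True)` commits on the first attempt because the refresh sees the location written by the concurrent INSERT, while `run(False)` fails after all retries because each attempt reuses the location from when the query started.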

@findinpath findinpath added the iceberg Iceberg connector label Sep 26, 2024