Entity creation rejected with unrelated transaction error #1123

Closed
@pingtimeout

Describe the bug

Under a low-concurrency test, entity updates are rejected with an HTTP 500 error. This applies to catalog updates, namespace creation, table updates, and similar operations.

To Reproduce

  • Check out this commit: pingtimeout@fbd3b90
  • Run the server using the getting-started docker compose file: docker compose -f getting-started/eclipselink/docker-compose.yml up
  • Export the client ID and secret as environment variables: export CLIENT_ID=root CLIENT_SECRET=s3cr3t
  • Run ./gradlew :polaris-benchmarks:gatlingRun

The simulation will run 5 concurrent users. Each user creates its own catalog (named C_0, C_1, ...). Then, each user sequentially creates 5 namespaces under its own catalog (named NS_0, NS_1, ...).
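The workload shape described above can be sketched as follows. This is only an illustration that issues no HTTP requests: the function name `planned_requests`, the `user N` labels, and the assumption that every catalog gets namespaces NS_0 through NS_4 are hypothetical and not taken from the benchmark code. It shows that the plan contains 35 requests and that no two creation requests target the same entity.

```python
# Sketch of the workload shape only; it does not talk to a Polaris server.
USERS = 5
NAMESPACES_PER_CATALOG = 5


def planned_requests(users=USERS, namespaces=NAMESPACES_PER_CATALOG):
    """Return the (request name, target) pairs the simulation would issue."""
    requests = []
    for u in range(users):
        catalog = f"C_{u}"
        # "user N" is a hypothetical label for the N-th simulated user.
        requests.append(("Authenticate", f"user {u}"))
        requests.append(("Create Catalog", catalog))
        for n in range(namespaces):
            # Assumption: namespaces are numbered NS_0..NS_4 inside each catalog.
            requests.append(("Create Namespace", f"{catalog}/NS_{n}"))
    return requests


requests = planned_requests()
creates = [target for name, target in requests if name != "Authenticate"]

print(len(requests))                       # total planned requests
print(len(creates) == len(set(creates)))   # True when no target is created twice
```

The total matches the 35 requests reported by Gatling: 5 authentications, 5 catalog creations, and 25 namespace creations.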

Actual Behavior

The Gatling output consistently shows that some catalogs and namespaces could not be created. In the output below, only 1 of the 5 catalogs was created. The other 4 catalog creations were rejected with an HTTP 500 error, and the 20 failed namespace creations returned HTTP 404.


========================================================================================================================
2025-03-05 16:42:50 UTC                                                                               0s elapsed
---- Requests -----------------------------------------------------------------------|---Total---|-----OK----|----KO----
> Global                                                                             |        35 |        11 |        24
> Authenticate                                                                       |         5 |         5 |         0
> Create Catalog                                                                     |         5 |         1 |         4
> Create Namespace                                                                   |        25 |         5 |        20
---- Errors ------------------------------------------------------------------------------------------------------------
> status.find.is(200), but actually found 404                                                                20 (83.33%)
> status.find.is(201), but actually found 500                                                                 4 (16.67%)
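As a quick check on how to read the error section, the percentages are shares of the failed (KO) requests, not of all requests. The short sketch below recomputes them from the counts in the report; the variable names are illustrative only.

```python
# Counts copied from the Gatling report above.
total_requests = 35
ok_requests = 11
failed_by_status = {"404": 20, "500": 4}

total_failed = sum(failed_by_status.values())

# Each percentage is relative to the number of failed requests.
shares = {
    status: round(100 * count / total_failed, 2)
    for status, count in failed_by_status.items()
}

print(total_failed == total_requests - ok_requests)
print(shares)
```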

The server log for the Polaris instance contains numerous errors like the one below:

2025-03-05 16:35:20 INFO  [org.apache.polaris.service.exception.IcebergExceptionMapper] (executor-thread-1) Handling runtimeException Exception [EclipseLink-4002] (Eclipse Persistence Services - 4.0.5.v202412231137-a96b873527f305f932543045c8679bb1de8d3a43): org.eclipse.persistence.exceptions.DatabaseException
Internal Exception: org.postgresql.util.PSQLException: ERROR: could not serialize access due to read/write dependencies among transactions
  Detail: Reason code: Canceled on identification as a pivot, during conflict out checking.
  Hint: The transaction might succeed if retried.
Error Code: 0
Call: UPDATE ENTITIES SET GRANTRECORDSVERSION = ?, VERSION = ? WHERE (((CATALOGID = ?) AND (ID = ?)) AND (VERSION = ?))
        bind => [5 parameters bound]
Query: UpdateObjectQuery(org.apache.polaris.jpa.models.ModelEntity@2da9cea8)

These errors are not caught, so the client receives an HTTP 500 response. Here is the payload received on the Gatling side:

{"error":{"message":"Exception [EclipseLink-4002] (Eclipse Persistence Services - 4.0.5.v202412231137-a96b873527f305f932543045c8679bb1de8d3a43): org.eclipse.persistence.exceptions.DatabaseException\nInternal Exception: org.postgresql.util.PSQLException: ERROR: could not serialize access due to concurrent update\nError Code: 0\nCall: UPDATE ENTITIES SET GRANTRECORDSVERSION = ?, VERSION = ? WHERE (((CATALOGID = ?) AND (ID = ?)) AND (VERSION = ?))\n\tbind => [5 parameters bound]\nQuery: UpdateObjectQuery(org.apache.polaris.jpa.models.ModelEntity@1123186d)","type":"PersistenceException","code":500}}
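The PostgreSQL hint in the server log says the transaction might succeed if retried. One generic way to act on that hint is to retry a transaction when it fails with SQLSTATE 40001 (`serialization_failure`). The sketch below illustrates the idea only; it is not how Polaris handles these errors. `TransactionError` and `run_with_retry` are hypothetical names, and a real driver exposes the SQLSTATE through its own exception type.

```python
import random
import time

SERIALIZATION_FAILURE = "40001"  # PostgreSQL SQLSTATE for serialization_failure


class TransactionError(Exception):
    """Hypothetical stand-in for a driver exception that carries a SQLSTATE."""

    def __init__(self, sqlstate, message=""):
        super().__init__(message)
        self.sqlstate = sqlstate


def run_with_retry(transaction, max_attempts=5, base_delay=0.01, sleep=time.sleep):
    """Run `transaction`, retrying only when it fails with a serialization failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except TransactionError as err:
            # Other errors, and the final failed attempt, propagate to the caller.
            if err.sqlstate != SERIALIZATION_FAILURE or attempt == max_attempts:
                raise
            # Exponential backoff with jitter so colliding transactions spread out.
            sleep(base_delay * (2 ** (attempt - 1)) * random.uniform(0.5, 1.5))


# Usage: a fake transaction that fails twice before succeeding.
attempts = {"count": 0}


def flaky_update():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise TransactionError(SERIALIZATION_FAILURE, "could not serialize access")
    return "updated"


result = run_with_retry(flaky_update, sleep=lambda _: None)
print(result, attempts["count"])
```

The whole transaction must be re-run from the start on each attempt, since a serialization failure rolls back everything done inside it.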

Expected Behavior

Each user works only on its own catalog and the namespaces under it, so no two requests target the same entity. All requests should therefore succeed.

Additional context

This result was reproduced even after #1092 was merged into main.

System information

No response
