Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FIX: Exception Handling in AWS Glue renameTable Method #11165

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

jishangarg
Copy link

@jishangarg jishangarg commented Sep 18, 2024

Related to issue: #11155

This PR addresses an issue in the AWS Glue renameTable method where an incorrect exception is thrown when a table with the new name already exists. According to the Iceberg Catalog interface, the method should throw org.apache.iceberg.exceptions.AlreadyExistsException. However, it currently throws software.amazon.awssdk.services.glue.model.AlreadyExistsException, leading to inconsistent behavior.

Problem:
The renameTable method throws an AWS Glue-specific AlreadyExistsException instead of the expected Iceberg-specific AlreadyExistsException. This violates the contract defined by the Iceberg Catalog interface and can cause confusion for applications relying on the Iceberg exception handling.

Solution:
Updated the glue.createTable() call: The AlreadyExistsException from AWS Glue is now caught and rethrown as Iceberg’s org.apache.iceberg.exceptions.AlreadyExistsException. Ensures that the Iceberg exception is consistently used in line with the catalog interface specification.

Changes:
Wrapped the glue.createTable() call in a try-catch block. Caught software.amazon.awssdk.services.glue.model.AlreadyExistsException and rethrew it as org.apache.iceberg.exceptions.AlreadyExistsException with an appropriate error message. Retained the existing rollback logic for handling failures when dropping the old table.

Impact:
Ensures consistent exception handling in the renameTable method. Aligns the behavior with the Iceberg API contract. No impact on backward compatibility, as this only corrects exception handling. This PR ensures that the renameTable operation behaves as expected according to the Iceberg documentation and resolves the mismatch in exception handling.

This PR addresses an issue in the AWS Glue renameTable method where an incorrect exception is thrown when a table with the new name already exists. According to the Iceberg Catalog interface, the method should throw org.apache.iceberg.exceptions.AlreadyExistsException. However, it currently throws software.amazon.awssdk.services.glue.model.AlreadyExistsException, leading to inconsistent behavior.

Problem:
The renameTable method throws an AWS Glue-specific AlreadyExistsException instead of the expected Iceberg-specific AlreadyExistsException.
This violates the contract defined by the Iceberg Catalog interface and can cause confusion for applications relying on the Iceberg exception handling.

Solution:
Updated the glue.createTable() call: The AlreadyExistsException from AWS Glue is now caught and rethrown as Iceberg’s org.apache.iceberg.exceptions.AlreadyExistsException.
Ensures that the Iceberg exception is consistently used in line with the catalog interface specification.

Changes:
Wrapped the glue.createTable() call in a try-catch block.
Caught software.amazon.awssdk.services.glue.model.AlreadyExistsException and rethrew it as org.apache.iceberg.exceptions.AlreadyExistsException with an appropriate error message.
Retained the existing rollback logic for handling failures when dropping the old table.

Impact:
Ensures consistent exception handling in the renameTable method.
Aligns the behavior with the Iceberg API contract.
No impact on backward compatibility, as this only corrects exception handling.
This PR ensures that the renameTable operation behaves as expected according to the Iceberg documentation and resolves the mismatch in exception handling.
@jishangarg
Copy link
Author

Can somebody please review my PR ?

Copy link
Contributor

@amogh-jahagirdar amogh-jahagirdar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix @jishangarg, overall the fix looks good to me but I think we need a test for this. Could we add one to TestGlueCatalog please?

LOG.info("created rename destination table {}", to);
} catch (software.amazon.awssdk.services.glue.model.AlreadyExistsException e) {
// Catch AWS Glue exception and rethrow it as Iceberg's AlreadyExistsException
throw new org.apache.iceberg.exceptions.AlreadyExistsException("Table already exists: " + toTableName, e);
Copy link
Contributor

@amogh-jahagirdar amogh-jahagirdar Sep 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use the fully qualified software.amazon.awssdk.services.glue.model.AlreadyExistsException for glue and then just use the normal class name for the Iceberg AlreadyExistsException. That aligns with what's done in the other parts of the code and is a bit more readable imo.

@amogh-jahagirdar
Copy link
Contributor

Also double check spotless, seems like those checks are failing. You can run ./gradlew :iceberg-aws:spotlessApply.
Also cc @geruh @singhpk234 @rahil-c @jackye1995 in case they wanted to take a look

@singhpk234
Copy link
Contributor

The particular case referred here makes sense but was thinking if this is something we need to handle broadly ?
IMHO we are lacking the exception handling entirely here for ex:
1/ What if it's 403 forbidden ? let;s say we have read only permission
2/ What if it;s service unavailable or rate limit exceeded etc

@amogh-jahagirdar was thinking if we can re-use / generalize this code or classification [1] and wire it to all all glue-clients request / response interaction, your thoughts ?

https://github.com/apache/iceberg/blob/main/aws/src/main/java/org/apache/iceberg/aws/glue/GlueTableOperations.java#L353

@amogh-jahagirdar
Copy link
Contributor

amogh-jahagirdar commented Sep 22, 2024

Thanks @singhpk234!

At the moment the catalog API for renameTable does not specify what the expected exceptions are for the 403 or throttling, so at the moment the Glue exception being surfaced for those cases doesn't really violate the interface but I agree in principle that we may as well make those Iceberg exceptions in the catalog implementation.

The part to consider here is downstream clients who may already be relying on the previous exception behavior and specifically handling Glue exceptions.

The already exists case is a bit more clear cut imo since the API defines the expected behavior for implementations and right now the impl is violating that.

Overall though, I wouldn't be opposed to upleveling the glue to iceberg exception conversion logic in GlueTableOperations to GlueToToIcebergConverter and then using that logic where relevant. It's more so once a certain exception behavior is in place, the risk of clients relying on that behavior increases over time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants