Undesired temp file deletion before upload #1197

Open
morillog opened this issue Aug 23, 2024 · 1 comment
Labels
component: s3 (S3 integration related issue)
status: waiting-for-feedback (Waiting for feedback from issuer)

Comments

morillog commented Aug 23, 2024

Type: Bug

Component: S3

Describe the bug
Hello,

I'm encountering an issue with an application responsible for generating a large number of documents. The problem arises sporadically, where I receive the following exception: io.awspring.cloud.s3.UploadFailedException: Upload failed. File is stored in a temporary folder in the filesystem <TempFolderDir>TempFileS3OutputStream....

After extensive analysis, I discovered that the file is being deleted just before the payload is generated for uploading within the method io.awspring.cloud.s3.AbstractTempFileS3OutputStream.close(). This issue doesn’t occur consistently; it seems random. However, due to the high volume of files processed, this causes a significant number of failures.

Interestingly, this temporary file is not being deleted by the logic in this method. It appears something else, possibly the garbage collector, is responsible. The core issue seems to be that the close method is unlocking the file for deletion before the upload is complete. Specifically, in line 129 of the AbstractTempFileS3OutputStream class, the localOutputStream is closed. Before this line is executed, the temp file cannot be deleted, but afterward, it becomes possible.

I believe this might be a bug because the localOutputStream should ideally be closed later on, not at the beginning, to prevent this kind of situation. I would appreciate it if you could review this and provide your feedback.

Here’s a list of steps I’ve already taken to troubleshoot the issue:

  1. The OutputStream (as shown in the code snippet below) is being used within a try-with-resources block.
  2. I’ve ensured that the OutputStream is not being closed elsewhere in the code.
  3. I’ve confirmed that the close method is not being called from different threads.
  4. I’ve verified that the third-party library we use for PDF generation does not hold any references to this temp file and does not perform any cleanup operations on it.

I’m reaching out as I’ve exhausted all ideas and would greatly appreciate your assistance.

Thank you very much in advance.

// OutputStream generation logic in a service
public OutputStream getOutputStream(String resultingName, S3BucketConfiguration bucketConfiguration) throws S3IntegrationException {
    if (bucketConfiguration == null || !StringUtils.hasText(bucketConfiguration.getBucketName())) {
        throw new IllegalArgumentException("Invalid bucket configuration");
    }
    // Note: `location` is built elsewhere; its construction is not shown in this snippet.
    WritableResource writableResource = (WritableResource) this.resourceLoader.getResource(location);
    try {
        return writableResource.getOutputStream();
    } catch (IOException e) {
        throw new S3IntegrationException(
                String.format("An exception occurred getting output stream from S3 bucket %s",
                        bucketConfiguration.getBucketName()), e);
    }
}

// How we use the OutputStream
try (OutputStream os = myservice.getCommunicationUploadOutputStream(s3Name)) {
    // Writing logic goes here.
} catch (IOException e) {
    throw new SomeException("String was not written to file.");
}

Sample
Since this issue occurs randomly, I don’t have a specific example to provide. However, you can reproduce the scenario by setting a breakpoint at line 129 in the AbstractTempFileS3OutputStream class.

Here’s how you can observe the problem:

  1. Add a breakpoint at line 129.
  2. While the execution is paused, try to manually delete the temporary file from the temp folder. You'll notice that the file is locked for deletion because it's still in use.
  3. Then, execute line 129 and attempt to delete the file again. You’ll find that the file can now be deleted.

This demonstrates that if a cleanup process (such as garbage collection, a cron job, or a third-party library) runs after the OutputStream is closed and before the put request is generated, the file could be deleted, leading to the error.
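The window described above can be sketched without S3 at all. The following is only an illustration under stated assumptions: `TempFileRaceSketch` and its method are hypothetical names, `Files.delete` stands in for whatever external process removes the file, and `Files.readAllBytes` stands in for building the upload request. Note also that the "locked for deletion" behaviour in step 2 is typical of Windows; on most Unix-like systems an open file can be unlinked at any time.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

class TempFileRaceSketch {

    /** Returns true if reading the temp file fails because it vanished after the stream was closed. */
    static boolean readFailsWhenFileDeletedAfterClose() {
        try {
            Path tempFile = Files.createTempFile("TempFileS3OutputStream", ".tmp");
            try {
                OutputStream local = Files.newOutputStream(tempFile);
                local.write("payload".getBytes(StandardCharsets.UTF_8));
                local.close(); // from here on, this code no longer holds the file open

                // Stand-in for an external cleanup process hitting the window.
                Files.delete(tempFile);

                try {
                    Files.readAllBytes(tempFile); // stand-in for building the upload request
                    return false;
                } catch (NoSuchFileException e) {
                    return true; // the file is gone before it could be read
                }
            } finally {
                Files.deleteIfExists(tempFile);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("read failed: " + readFailsWhenFileDeletedAfterClose());
    }
}
```

This only shows that the failure is possible once something deletes the file in that window; it does not identify what is deleting it.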

@maciejwalkowiak maciejwalkowiak added the component: s3 S3 integration related issue label Sep 18, 2024
@maciejwalkowiak
Contributor

Thanks @morillog for the detailed description. As far as I can tell, garbage collection does not delete temporary files. They can only be deleted explicitly, or on JVM exit when deleteOnExit() has been called on the file.
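A quick way to check this locally is the sketch below. It is not taken from the library; `TempFileLifetimeSketch` is a hypothetical name, and `System.gc()` is only a hint to the JVM, so treat it as an illustration rather than a proof.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

class TempFileLifetimeSketch {

    /** Creates a temp file, drops the File reference, requests a GC, and reports whether the file is still on disk. */
    static boolean survivesGarbageCollection() {
        try {
            File temp = File.createTempFile("gc-check", ".tmp");
            String path = temp.getAbsolutePath();
            temp = null;   // no File object refers to the path any more
            System.gc();   // only a request; the JVM may or may not collect now

            File again = new File(path);
            boolean stillThere = again.exists();
            again.delete(); // explicit deletion is what actually removes the file
            return stillThere;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("still on disk after GC request: " + survivesGarbageCollection());
    }
}
```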

We also cannot close the output stream after the upload method is called, as the written content of the file is available only after the stream is closed.
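To illustrate why the order matters, here is a small sketch assuming the local stream is buffered (a `BufferedOutputStream` over a `FileOutputStream`); `BufferedCloseSketch` is a hypothetical name and the code is not taken from the library. A small write stays in the in-memory buffer, so the file on disk is still empty until the stream is closed.

```java
import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

class BufferedCloseSketch {

    /** Returns {size on disk before close, size on disk after close} for a 10-byte write. */
    static long[] sizesBeforeAndAfterClose() {
        try {
            Path file = Files.createTempFile("buffered", ".tmp");
            try {
                OutputStream out = new BufferedOutputStream(new FileOutputStream(file.toFile()));
                out.write(new byte[10]);          // small write, held in the buffer
                long before = Files.size(file);   // nothing has reached the file yet
                out.close();                      // close flushes the buffer to the file
                long after = Files.size(file);
                return new long[] {before, after};
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] sizes = sizesBeforeAndAfterClose();
        System.out.println("before close: " + sizes[0] + ", after close: " + sizes[1]);
    }
}
```

An upload started before the close would therefore risk sending a truncated or empty file.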

Can you experiment with providing a custom OutputStream implementation that creates the file in a configured location instead of the temporary folder? Perhaps there is something running on your instance that deletes temporary files in the background?
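As a rough starting point for that experiment, a stream that stages its content in a directory you control could look like the sketch below. Everything here is an assumption for illustration: `ConfiguredDirUploadStream` and the `uploader` callback are hypothetical, the callback stands in for the real S3 put request, and wiring the stream into Spring Cloud AWS (for example through an `S3OutputStreamProvider`) is not shown.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Consumer;

/** Hypothetical stream that buffers to a file in a configured directory and hands it to an uploader on close. */
class ConfiguredDirUploadStream extends OutputStream {

    private final Path file;
    private final OutputStream local;
    private final Consumer<Path> uploader; // stand-in for the real upload call
    private boolean closed;

    ConfiguredDirUploadStream(Path directory, Consumer<Path> uploader) throws IOException {
        Files.createDirectories(directory);
        // The staging file lives in the configured directory, not the system temp folder.
        this.file = Files.createTempFile(directory, "s3-upload-", ".tmp");
        this.local = new BufferedOutputStream(Files.newOutputStream(file));
        this.uploader = uploader;
    }

    @Override
    public void write(int b) throws IOException {
        local.write(b);
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        local.write(b, off, len);
    }

    @Override
    public void close() throws IOException {
        if (closed) {
            return;
        }
        closed = true;
        local.close(); // content must be fully on disk before it is uploaded
        try {
            uploader.accept(file);
        } finally {
            Files.deleteIfExists(file); // clean up the staging file ourselves
        }
    }

    /** Usage example: returns the content the uploader saw. */
    static String demo() {
        try {
            Path dir = Files.createTempDirectory("upload-staging");
            StringBuilder seen = new StringBuilder();
            try (OutputStream os = new ConfiguredDirUploadStream(dir, path -> {
                try {
                    seen.append(new String(Files.readAllBytes(path), StandardCharsets.UTF_8));
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            })) {
                os.write("hello".getBytes(StandardCharsets.UTF_8));
            }
            Files.delete(dir); // succeeds only if the staging file was removed
            return seen.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

If the failures stop once the files are staged outside the temp folder, that would point to an external cleanup of the temp folder rather than to the close logic.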

@maciejwalkowiak maciejwalkowiak added the status: waiting-for-feedback Waiting for feedback from issuer label Sep 19, 2024