Describe the feature
Context: I am migrating some usages of the v1.x SDK TransferManager to the v2.x equivalent S3TransferManager using a non-CRT multipart client.
When MultipartS3AsyncClient.putObject() performs a multipart upload, it may result in up to 10,000 UploadPart requests being sent to the underlying client at once. From what I can tell, each of these immediately attempts to acquire a connection from the underlying (Netty) HTTP client, with requests beyond the configured maximum connections (default 50) left waiting, and a high likelihood of some failing with a connection acquisition timeout (default 10s) of the form:
```
Caused by: java.util.concurrent.CompletionException: software.amazon.awssdk.core.exception.SdkClientException: Unable to execute HTTP request: Acquire operation took longer than the configured maximum time. This indicates that a request cannot get a connection from the pool within the specified maximum time. This can be due to high request rate.
Consider taking any of the following actions to mitigate the issue: increase max connections, increase acquire timeout, or slowing the request rate.
Increasing the max connections can increase client throughput (unless the network interface is already fully utilized), but can eventually start to hit operation system limitations on the number of file descriptors used by the process. If you already are fully utilizing your network interface or cannot further increase your connection count, increasing the acquire timeout gives extra time for requests to acquire a connection before timing out. If the connections doesn't free up, the subsequent requests will still timeout.
If the above mechanisms are not able to fix the issue, try smoothing out your requests so that large traffic bursts cannot overload the client, being more efficient with the number of times you need to call AWS, or by increasing the number of hosts sending requests.
```
This becomes more likely as the size of the multipart upload grows, since larger uploads produce more parts.
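For reference, a minimal sketch of the setup involved (bucket name and file path are placeholders):

```java
import java.nio.file.Paths;

import software.amazon.awssdk.services.s3.S3AsyncClient;
import software.amazon.awssdk.transfer.s3.S3TransferManager;
import software.amazon.awssdk.transfer.s3.model.UploadFileRequest;

public class MultipartUploadRepro {
    public static void main(String[] args) {
        // Java-based (non-CRT) async client with multipart enabled;
        // S3TransferManager delegates putObject to MultipartS3AsyncClient.
        S3AsyncClient s3 = S3AsyncClient.builder()
                .multipartEnabled(true)
                .build();

        try (S3TransferManager tm = S3TransferManager.builder().s3Client(s3).build()) {
            // A single large upload fans out into many concurrent UploadPart requests,
            // all competing for connections from the Netty client's pool.
            tm.uploadFile(UploadFileRequest.builder()
                            .putObjectRequest(b -> b.bucket("my-bucket").key("large-object.bin"))
                            .source(Paths.get("/path/to/large-file.bin"))
                            .build())
                    .completionFuture()
                    .join();
        }
    }
}
```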
The available workarounds that I can see (sketched after this list) are:
- Use the CRT-based client (which provides a maxConcurrency configuration option)
- Increase one or both of the connection acquisition timeout and max connections configuration options on the underlying client
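A sketch of those two workarounds, with illustrative values:

```java
import java.time.Duration;

import software.amazon.awssdk.http.nio.netty.NettyNioAsyncHttpClient;
import software.amazon.awssdk.services.s3.S3AsyncClient;

public class WorkaroundSketch {
    public static void main(String[] args) {
        // Option 1: CRT-based client, which caps concurrent requests directly.
        S3AsyncClient crtClient = S3AsyncClient.crtBuilder()
                .maxConcurrency(50)
                .build();

        // Option 2: Java-based multipart client with a larger connection pool and a
        // longer acquisition timeout on the underlying Netty client.
        S3AsyncClient nettyClient = S3AsyncClient.builder()
                .multipartEnabled(true)
                .httpClientBuilder(NettyNioAsyncHttpClient.builder()
                        .maxConcurrency(200)
                        .connectionAcquisitionTimeout(Duration.ofMinutes(5)))
                .build();
    }
}
```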
I believe this wasn't an issue with the v1.x TransferManager, as it used a bounded executor service (10 threads by default) that naturally limited the request rate.
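For comparison, the v1.x behaviour can be expressed explicitly like this (a sketch; the fixed pool size mirrors the documented default of 10):

```java
import java.util.concurrent.Executors;

import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.transfer.TransferManager;
import com.amazonaws.services.s3.transfer.TransferManagerBuilder;

public class V1TransferManagerSketch {
    public static void main(String[] args) {
        // v1.x: part uploads are submitted to a bounded executor, so only a handful
        // of UploadPart requests are in flight at any one time.
        TransferManager tm = TransferManagerBuilder.standard()
                .withS3Client(AmazonS3ClientBuilder.defaultClient())
                .withExecutorFactory(() -> Executors.newFixedThreadPool(10))
                .build();
    }
}
```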
The default connection acquisition timeout (10s) and max connections (50) here seem to make large multipart transfers likely to fail (depending on network conditions). Could MultipartS3AsyncClient be made a bit smarter about the number of concurrent multipart operations it attempts?
Use Case
We made a largely straightforward migration from the v1.x TransferManager to the v2.x equivalent and immediately noticed failing upload requests, particularly for larger uploads (e.g. reproduced with a 3.5 GiB upload).
One of the suggestions in the exception is "If the above mechanisms are not able to fix the issue, try smoothing out your requests so that large traffic bursts cannot overload the client"; however, in this particular case the burst of traffic is a single transfer manager upload request that immediately results in a large number of part upload requests.
Proposed Solution
Possibly make the number of concurrent multipart operations configurable for MultipartS3AsyncClient.
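One possible shape for this, purely as a hypothetical sketch (a maxConcurrentParts setting does not exist today; the existing MultipartConfiguration options are shown for context):

```java
import software.amazon.awssdk.services.s3.S3AsyncClient;
import software.amazon.awssdk.services.s3.multipart.MultipartConfiguration;

// Hypothetical: a per-client cap on concurrent part uploads, so a single
// putObject cannot flood the underlying HTTP client's connection pool.
S3AsyncClient s3 = S3AsyncClient.builder()
        .multipartEnabled(true)
        .multipartConfiguration(MultipartConfiguration.builder()
                .minimumPartSizeInBytes(8L * 1024 * 1024) // existing option
                .maxConcurrentParts(16)                    // hypothetical new option
                .build())
        .build();
```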
Other Information
No response
Acknowledgements
- I may be able to implement this feature request
- This feature might incur a breaking change
AWS Java SDK version used
2.38.7
JDK version used
Java 17
Operating System and version
macOS 26.1