You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Most functions call _s3_call which calls set_session. The function goal is to "connect" to aws.
In the context of multiple jobs (or batch_size > 1), each jobs will race to create the session despite the check if the session was already created.
Using process authentification, assume role and source_profile in the aws config, loading the credentials will fail with InfiniteLoopConfigError which can be replicated with the following example:
importbotocore.sessionimportbotocore.credentials# this part will be done once by s3fssession=botocore.session.Session(profile="profile")
resolver=botocore.credentials.create_credential_resolver(session, region_name=None)
# this part might be runned multiple times due to a potential race conditioncredentials=resolver.load_credentials() # <--- workscredentials=resolver.load_credentials() # <--- fails
where profile depends on an other profile.
One way to fix this is to use asyncio.Lock when creating the session to ensure that only one job will create the session.
I'm open to contribute if this is something we want to fix.
The text was updated successfully, but these errors were encountered:
Since you are talking about "jobs", do you mean you have threads running?
Agree, an syncio lock would be reasonable. The only await is at the call for s3creator, but load_credentials() happens earlier. I wonder if there's an asyncio clean way to say: "I have to await this, but I don't really want to yield control". Otherwise, we need to pass this new Lock object around.
On my experiment it was only multiple asyncio futures that was running (using the batch size argument of the async file system of fsspec). I have added a asyncio.Lock and it seemed to fixed the problem. That's why I opened the issue.
Most functions call
_s3_call
which callsset_session
. The function goal is to "connect" to aws.In the context of multiple jobs (or batch_size > 1), each jobs will race to create the session despite the check if the session was already created.
Using process authentification, assume role and source_profile in the aws config, loading the credentials will fail with InfiniteLoopConfigError which can be replicated with the following example:
where profile depends on an other profile.
One way to fix this is to use
asyncio.Lock
when creating the session to ensure that only one job will create the session.I'm open to contribute if this is something we want to fix.
The text was updated successfully, but these errors were encountered: