Skip to content

gh-116738: Make bisect module thread-safe #137021

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open

Conversation

yoney
Copy link
Contributor

@yoney yoney commented Jul 22, 2025

Modify the insort_left() and insort_right() functions in the bisect module to be thread-{aware,safe} by adding a lock to the sequence argument. While these functions did not crash before, it is more intuitive for them to lock the sequence they are about to modify. These functions locate the correct position to insert an item and then perform the insertion. By adding a lock, the sequence remains safe from concurrent modifications by other threads during this process.

The new tests failed before introducing the lock.

cc: @mpage @Yhg1s @colesbury

@yoney yoney marked this pull request as ready for review July 22, 2025 20:03
@yoney yoney requested a review from rhettinger as a code owner July 22, 2025 20:03
@mpage mpage requested review from colesbury, Yhg1s and mpage July 22, 2025 20:09
Copy link
Contributor

@colesbury colesbury left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pure Python implementations in bsiect.py are still not going to be thread safe.

While these functions did not crash before, it is more intuitive for them to lock the sequence they are about to modify...

I'm not fully convinced this is worth it. I think it's okay for bisect not to be thread safe, but I don't feel very strongly about this one way or the other.

from test.support import import_helper, threading_helper, subTests
from test.support.threading_helper import run_concurrently

bisect = import_helper.import_module("bisect")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this just be import bisect?

If I understand correctly, import_helper.import_module() handles missing optional modules but that doesn't seem to be relevant here.

@yoney
Copy link
Contributor Author

yoney commented Jul 22, 2025

I'm not fully convinced this is worth it. I think it's okay for bisect not to be thread safe, but I don't feel very strongly about this one way or the other.

I don't feel strongly about landing it. If we agree current state is good enough, we can go ahead and mark the module as safe and close the PR.

@yoney
Copy link
Contributor Author

yoney commented Jul 22, 2025

The pure Python implementations in bsiect.py are still not going to be thread safe.

Yes, pure Python version would be different, which is a good reason to consider dropping this change.
Just to clarify my understanding; there is noway to lock the mutex on the object from Python, right?

@colesbury
Copy link
Contributor

Just to clarify my understanding; there is noway to lock the mutex on the object from Python, right?

No, we don't expose a way to lock the object. Critical sections are not very useful from Python because the eval breaker checks can run arbitrary code (such as via the GC) and that can temporarily suspend the active critical section. So things would only be "atomic most of the time", which is not a very useful property.

@ZeroIntensity
Copy link
Member

The pure Python implementations in bsiect.py are still not going to be thread safe.

I think this is an issue with several other modules already. For example, the pure-Python implementation of heapq is not thread-safe while the C implementation is.

@rhettinger rhettinger removed their request for review July 23, 2025 02:46
@rhettinger
Copy link
Contributor

I think this is an issue with several other modules already. For example, the pure-Python implementation of heapq is not thread-safe while the C implementation is.

ISTM that all we want in these cases is for the C implementation to not segfault. There is no need to promise more than the pure python version promises. If the heapq module was edited to do more, that should probably be reverted. The C versions are intended to just be accelerators.

@colesbury
Copy link
Contributor

I think with heapq the locks were necessary to make it not segfault under race conditions because it directly accessed list internals. I don't think that's the case here because we're using the abstract sequence API instead.

@kumaraditya303
Copy link
Contributor

While these functions did not crash before, it is more intuitive for them to lock the sequence they are about to modify.

I think instead of adding locks for this, we should document the thread safety of this module like as in #136555

@ZeroIntensity
Copy link
Member

To clarify, I'm not saying we should necessarily add locking to bisect, I just think we shouldn't reject the idea on the basis that it would create differences between the C and Python implementation, because we already do that in other modules.

There is no need to promise more than the pure python version promises. If the heapq module was edited to do more, that should probably be reverted. The C versions are intended to just be accelerators.

The problem is that by adding memory safety to things like heapq (i.e., making it not crash via locking), we're also incidentally adding thread safety. But, then the Python implementation remains memory safe without thread safety (i.e., doesn't crash but also doesn't really work).

@serhiy-storchaka
Copy link
Member

If we make the C implementation thread-safe, users will start writing code that depends on this and won't work with a pure Python implementation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants