Process superclass methods before subclass methods in semanal #18723

ilevkivskyi · 2025-02-22T21:35:42Z

See also discussion in #18674 for another situation when this causes problems (deferrals). In general this problem is probably quite rare, but it bugs me, so I decided to go ahead with a simple and explicit (even though a bit ugly) solution.

ilevkivskyi · 2025-02-22T21:39:16Z

Hm, for some reason tests didn't start, I will try closing and re-opening.

ilevkivskyi · 2025-02-23T00:07:35Z

Oh well, it looks like this PR causes mypyc compiled mypy to segfault when running some tests.

ilevkivskyi · 2025-02-23T00:59:28Z

And as I guessed the error happens in one of those Bogus things, more precisely in CPyDef_semanal___SemanticAnalyzer___qualified_name (in PyUnicode_Concat one of the args is NULL or something).

ilevkivskyi · 2025-02-23T18:34:02Z

Actually it is much more tricky than that: fullname etc. are no longer Bogus; empty strings are used instead. Looking at gdb, it seems that two totally valid strings are passed to PyUnicode_Concat, but it still segfaults. Maybe something is wrong with refcounting? I will try to dig a bit more with Python debug build.

JukkaL · 2025-02-24T10:31:29Z

mypy/semanal_main.py

+        return -1
+    if right_info in left_info.mro:
+        return 1
+    return 0


This seems to change the order of processing targets, even if derived classes are always after base classes (i.e. current ordering is already fine). I suspect that this will break the current SCC ordering algorithm, which we probably rely on in a bunch of places, and it could explain why things are failing. I think we must mostly follow the SCC ordering or we will have a bunch of weird regressions and generally a bad time.

Here's one potential way to fix this so this only changes the order when necessary:

Create a linear list of targets, similar to what you currently have.

Collect a set of all TypeInfos in the targets (e.g. all active_type values).

Iterate over the targets, and keep track of which TypeInfos we've processed by removing them from the TypeInfo set created in the previous step. If we encounter a TypeInfo which has some MRO item that is still in the set of TypeInfos, move that target to a separate list (deferreds) instead of processing it now.

After having iterated all targets, iterate over the deferred items.

The above approach could possibly be made even better by processing deferred nodes immediately after all the MRO entries have been processed, instead of waiting for all targets to be processed.

This has the benefit of not changing the processing order if it's already correct, and if it's incorrect, only the impacted targets will get rescheduled. This also could be a bit faster, since we perform a linear scan instead of a sort.
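The steps above could be sketched roughly as follows. This is only an illustration, not mypy's code: `Target`, `reorder_targets`, and the `mro` dictionary are hypothetical stand-ins for the real target tuples and `TypeInfo.mro`, and the sketch assumes a class counts as processed once all of its targets have been emitted.

```python
from collections import Counter
from typing import NamedTuple, Optional


class Target(NamedTuple):
    """Hypothetical stand-in for a semanal target (not mypy's real type)."""

    name: str
    active_type: Optional[str]  # enclosing class name, or None for plain functions


def reorder_targets(targets: list[Target], mro: dict[str, list[str]]) -> list[Target]:
    """Linear scan that defers only targets whose base classes are still pending.

    `mro` maps a class name to its MRO, with the class itself first
    (an assumption that mirrors the shape of TypeInfo.mro).
    """
    # Number of not-yet-emitted targets per class.
    pending = Counter(t.active_type for t in targets if t.active_type is not None)
    ordered: list[Target] = []
    deferred: list[Target] = []
    for target in targets:
        cls = target.active_type
        # Skip the first MRO entry, which is the class itself.
        if cls is not None and any(pending[base] > 0 for base in mro[cls][1:]):
            deferred.append(target)
            continue
        ordered.append(target)
        if cls is not None:
            pending[cls] -= 1
    # Deferred targets run after everything else, in their original relative order.
    return ordered + deferred


targets = [Target("B.f", "B"), Target("A.f", "A"), Target("g", None), Target("A.h", "A")]
mro = {"A": ["A", "object"], "B": ["B", "A", "object"]}
print([t.name for t in reorder_targets(targets, mro)])
```

If the input already lists base-class targets first, nothing is deferred and the list comes back unchanged. The sketch does not re-check the ordering among the deferred targets themselves.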

@JukkaL

This seems to change the order of processing targets, even if derived classes are always after base classes (i.e. current ordering is already fine)

I don't think I am following. Can you give an example of when this happens? I actually did a diff on the full target list for mypy self check (including stdlib), and it is tiny: only a few things that actually matter were changed (e.g. a couple of visitors in mypy.types vs mypy.type_visitor).

Even then, how can the order of processing method bodies be so important? (All the top levels, including ClassDefs, are already processed at this point.)

Ah I think I misunderstood how the ordering works. So it's probably fine. Changing the ordering of methods "shouldn't" change much, but it's just a very scary change that could trigger some pre-existing bugs or limitations. But if this only changes ordering very slightly, it should be fine.

Can you also manually test this when you import torch and numpy? At least torch has a massive import cycle which should be a good test case.

JukkaL · 2025-02-24T17:03:04Z

Maybe something is wrong with refcounting? I will try to dig a bit more with Python debug build.

Using a debug build is a good idea. Reference counting has been quite stable for a long time, but it's possible that something is still misbehaving.

github-actions · 2025-02-24T20:12:43Z

According to mypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅

ilevkivskyi · 2025-02-24T20:29:12Z

@JukkaL

Using a debug build is a good idea. Reference counting has been quite stable for a long time, but it's possible that something is still misbehaving.

It looks like something is wrong with unpacking of tuples. Replacing unpacking with indexing fixes the segfaults (see last commit). I still don't have any small repro, but looking at this comment

# Special-case multiple assignments like 'x, y = expr' to reduce refcount ops.

it seems to me this may be caused by #16022
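For illustration, the kind of rewrite described above (indexing instead of unpacking) might look like this. The workaround commit itself is not shown in this thread, so the function names here are hypothetical and this is not the actual diff.

```python
def second_by_unpacking(pair: tuple[str, int]) -> int:
    # Unpacking style: the first target is bound but never read.
    _, value = pair
    return value


def second_by_indexing(pair: tuple[str, int]) -> int:
    # Indexing style: only the item that is needed is read from the tuple.
    value = pair[1]
    return value
```

Both functions return the same result in regular Python; they differ only in which assignment form the compiler sees.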

ilevkivskyi · 2025-02-24T20:45:29Z

@JukkaL somewhat weird test to reproduce the segfault

[case testTupleUnpackingInCallback]
def f(x: tuple[str, int], y: tuple[str, int]) -> int:
    _, xi = x
    _, yi = y
    return 0

[file driver.py]
from native import f
from functools import cmp_to_key

xs = [("x" * i, i) for i in range(100)]
assert sorted(xs, key=cmp_to_key(f))[-1][1] == 99

ilevkivskyi · 2025-02-24T20:54:31Z

Another test case (a bit less sketchy) shows that the problem appears if one of the unpacking targets is unused in the function:

[case testTupleUnpackingInCallback]
def f(x: tuple[str, int], y: tuple[str, int]) -> int:
    a, xi = x
    _, yi = y
    if a == "":
        return 0
    return 0

[file driver.py]
from native import f
from functools import cmp_to_key

xs = [("x" * i, i) for i in range(100)]
xs = sorted(xs, key=cmp_to_key(f))
print(xs[1])
print(xs[2])

ilevkivskyi · 2025-02-24T21:05:04Z

OK, sorry for spamming, last message until I (or you) fix this: finally, a self-contained repro for the segfault:

[case testTupleUnpackingInCallback]
def f(x: tuple[str, int]) -> int:
    a, xi = x
    return 0

[file driver.py]
from native import f

xs = [("x" * i, i) for i in range(100)]
xs = [x for x in xs if f(x) == 0]
print(xs[1])
print(xs[2])

ilevkivskyi · 2025-02-25T11:00:57Z

@JukkaL unless I am missing some other edge case, I think #18732 should fix it.

Process superclass methods before subclass methods in semanal

24ea7eb

ilevkivskyi requested a review from JukkaL February 22, 2025 21:35

ilevkivskyi mentioned this pull request Feb 22, 2025

Do not blindly undefer on leaving fuction #18674

Open

ilevkivskyi closed this Feb 22, 2025

ilevkivskyi reopened this Feb 22, 2025

JukkaL reviewed Feb 24, 2025

Work around mypyc bug

a068451
