Conversation

@dcreager (Member) commented Nov 12, 2025:

This patch lets us create specializations from a constraint set. The constraint encodes the restrictions on which types each typevar can specialize to. Given a generic context and a constraint set, we iterate through all of the generic context's typevars. For each typevar, we abstract the constraint set so that it only mentions the typevar in question (propagating derived facts if needed). We then find the "best representative type" for the typevar given the abstracted constraint set.
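As a rough illustration, here is an editorial sketch in the mdtest style quoted later in this thread, using the same `ConstraintSet`/`generic_context`/`specialize_constrained` API; the revealed output in the comment is an assumption, not output taken from the PR:

```py
from typing import Never, reveal_type
from ty_extensions import ConstraintSet, generic_context

def single[T]():
    # One constraint (Never ≤ T ≤ int) abstracts to a single lower/upper
    # bound for T; the upper bound is the "closest to object" choice.
    constraints = ConstraintSet.range(Never, T, int)
    # presumably revealed: ty_extensions.Specialization[T@single = int]
    reveal_type(generic_context(single).specialize_constrained(constraints))
```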

In the BDD structure of the abstracted constraint set, each path from the root to the `true` terminal represents one way that the constraint set can be satisfied. (Each such path is also one of the clauses in the DNF representation of the constraint set's boolean formula.) A path is the conjunction of the individual constraints at the internal nodes it traverses, and intersecting those constraints yields a single lower/upper bound for the path. We use the upper bound as the "best" (i.e. "closest to `object`") type for that path.

If there are multiple paths in the BDD, they technically represent independent possible specializations. If there's a single specialization that satisfies all of them, we will return that as the specialization. If not, then the constraint set is ambiguous. (This happens most often with constrained typevars.) We could in the future turn each of the paths into separate specializations, but it's not clear what we would do with that, so instead we just report the ambiguity as a specialization failure.
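To make the path-walking concrete, here is a toy editorial sketch in Python (not ty's actual Rust implementation) that enumerates the satisfying root-to-`true` paths of a tiny BDD and collects the upper bound each path imposes on `T`:

```py
from dataclasses import dataclass

@dataclass
class Node:
    upper: str                  # this internal node tests "T ≤ upper"
    if_true: "Node | bool"      # child when the constraint holds
    if_false: "Node | bool"     # child when it does not

def satisfying_paths(node, acc=(), out=None):
    """Each root-to-true path is one way to satisfy the set (a DNF clause)."""
    if out is None:
        out = []
    if node is True:
        out.append(list(acc))   # the constraints conjoined along this path
    elif node is not False:     # the false terminal ends unsatisfiable paths
        satisfying_paths(node.if_true, acc + (node.upper,), out)
        satisfying_paths(node.if_false, acc, out)
    return out

# (T ≤ int) ∨ (T ≤ str): two satisfying paths, hence two candidate upper
# bounds. A single specialization exists only if one type satisfies both.
bdd = Node("int", True, Node("str", True, False))
print(satisfying_paths(bdd))    # [['int'], ['str']]
```

With two surviving paths and no single type satisfying both clauses, the set is ambiguous in the sense described above.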

@dcreager added the `internal` (An internal refactor or improvement) and `ty` (Multi-file analysis & type inference) labels on Nov 12, 2025
astral-sh-bot commented Nov 12, 2025:

Diagnostic diff on typing conformance tests

No changes detected when running ty on typing conformance tests ✅

astral-sh-bot commented Nov 12, 2025:

mypy_primer results

Changes were detected when running on open source projects
scikit-build-core (https://github.com/scikit-build/scikit-build-core)

```diff
- src/scikit_build_core/build/wheel.py:98:20: error[no-matching-overload] No overload of bound method `__init__` matches arguments
- Found 44 diagnostics
+ Found 43 diagnostics
```

No memory usage changes detected ✅

Base automatically changed from dcreager/coolable to main November 17, 2025 18:43
@dcreager force-pushed the dcreager/specialize-from-constraints branch from 891e357 to c562946 on November 17, 2025 19:48
@dcreager changed the base branch from main to dcreager/ordering-bug on November 17, 2025 19:48
@dcreager force-pushed the dcreager/specialize-from-constraints branch 5 times, most recently from 3237964 to a28f562, on November 17, 2025 22:22
@dcreager marked this pull request as ready for review on November 18, 2025 01:40
dcreager added a commit that referenced this pull request on Nov 18, 2025:

This saga began with a regression in how we handle constraint sets where
a typevar is constrained by another typevar, which #21068 first added
support for:

```py
def mutually_constrained[T, U]():
    # If [T = U ∧ U ≤ int], then [T ≤ int] must be true as well.
    given_int = ConstraintSet.range(U, T, U) & ConstraintSet.range(Never, U, int)
    static_assert(given_int.implies_subtype_of(T, int))
```

While working on #21414, I saw a regression in this test, which was
strange, since that PR has nothing to do with this logic! The issue is
that something in that PR made us instantiate the typevars `T` and `U`
in a different order, giving them differently ordered salsa IDs. And
importantly, we use these salsa IDs to define the variable ordering that
is used in our constraint set BDDs. This showed that our "mutually
constrained" logic only worked for one of the two possible orderings.
(We can — and now do — test this in a brute-force way by copy/pasting
the test with both typevar orderings.)

The underlying bug was in our `ConstraintSet::simplify_and_domain`
method. It would correctly detect `(U ≤ T ≤ U) ∧ (U ≤ int)`, because
those two constraints affect different typevars, and from that, infer `T
≤ int`. But it wouldn't detect the equivalent pattern in `(T ≤ U ≤ T) ∧
(U ≤ int)`, since those constraints affect the same typevar. At first I
tried adding that as yet more pattern-match logic in the ever-growing
`simplify_and_domain` method. But doing so caused other tests to start
failing.

At that point, I realized that `simplify_and_domain` had gotten to the
point where it was trying to do too much, and for conflicting consumers.
It was first written as part of our display logic, where the goal is to
remove redundant information from a BDD to make its string rendering
simpler. But we also started using it to add "derived facts" to a BDD. A
derived fact is a constraint that doesn't appear in the BDD directly,
but which we can still infer to be true. Our failing test relies on
derived facts — being able to infer that `T ≤ int` even though that
particular constraint doesn't appear in the original BDD. Before,
`simplify_and_domain` would trace through all of the constraints in a
BDD, figure out the full set of derived facts, and _add those derived
facts_ to the BDD structure. This is brittle, because those derived
facts are not universally true! In our example, `T ≤ int` only holds
along the BDD paths where both `T = U` and `U ≤ int`. Other paths will
test the negations of those constraints, and on those, we _shouldn't_
infer `T ≤ int`. In theory it's possible (and we were trying) to use BDD
operators to express that dependency...but that runs afoul of how we
were simultaneously trying to _remove_ information to make our displays
simpler.

So, I ripped off the band-aid. `simplify_and_domain` is now _only_ used
for display purposes. Apart from removing some logic that is definitely
not used by our `Display` impl, I have left it untouched; I did not want
to disturb that house of cards for now, since the display logic is not
load-bearing for any type inference.

For all non-display callers, we have a new **_sequent map_** data type,
which tracks exactly the same derived information. But it does so (a)
without trying to remove anything from the BDD, and (b) lazily, without
updating the BDD structure.

So the end result is that all of the tests (including the new
regression tests) pass, via a more efficient (and hopefully better
structured/documented) implementation, at the cost of hanging onto a
pile of display-related tech debt that we'll want to clean up at some
point.
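To illustrate the derived-facts idea, here is an editorial toy in Python (ty's actual sequent map is a Rust data type and operates on real types, not string triples): close the constraints that hold along one BDD path under transitivity, without mutating the BDD.

```py
def derived_facts(path_constraints):
    """path_constraints: (lower, var, upper) triples that hold along ONE path.
    Close them under transitivity without touching the BDD itself."""
    facts = set(path_constraints)
    changed = True
    while changed:
        changed = False
        for (lo1, v1, up1) in list(facts):
            for (lo2, v2, up2) in list(facts):
                # If v1's upper bound is itself the constrained typevar v2,
                # then v1 inherits v2's upper bound: v1 ≤ v2 ≤ up2.
                if up1 == v2 and (lo1, v1, up2) not in facts:
                    facts.add((lo1, v1, up2))
                    changed = True
    return facts

# Along the path where (U ≤ T ≤ U) and (U ≤ int) both hold, we can derive
# T ≤ int. Crucially, this holds only on that path, not in the whole BDD.
path = {("U", "T", "U"), ("Never", "U", "int")}
assert ("U", "T", "int") in derived_facts(path)
```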
Base automatically changed from dcreager/ordering-bug to main November 18, 2025 17:02
@dcreager force-pushed the dcreager/specialize-from-constraints branch from a28f562 to 1c9744d on November 18, 2025 18:36
@carljm (Contributor) left a comment:
This is lovely!

My only questions fall into the category of "constrained typevars are weird" and can be punted to later.

```py
class Unrelated: ...

def constrained[T: (Base, Unrelated)]():
    # revealed: ty_extensions.Specialization[T@constrained = Base | Unrelated]
```
@carljm (Contributor):
Doesn't this result violate "must specialize to one of those specific types"?

I feel like we discussed this case earlier and looked at what other type checkers do. I think they rather infer Unknown if they have no basis for picking one of the constrained types? At least that's what pyright seems to do: https://pyright-play.net/?pyrightVersion=1.1.405&pythonVersion=3.12&reportUnreachable=true&code=CYUwZgBGDaAqBcEAUBLAdgFwDQQM4YCcBKAXSQA9FYIAfCAOQHs0QiIBaAPggQCgIBEFJHJDcENIwwNmIeP0GKCIDAFcCaCOQUCCAQxS4QPAJ4AHEAFECBRgSQAiAIIBjNXoA2HkzJZiIAO62aADmDkS8vMoAbiCeAPoY5iBIYEhMLEQRvEA

@dcreager (Member Author):
Ah yes, come to think of it, this is where we talked way back about how we might get more than one possible specialization. That seems like the better way to model this — multiple satisfiable paths in the BDD lead to an ambiguous result (and we can give you all of them if you want them), not to the union of those results.

@carljm (Contributor):
Well I would distinguish between constrained typevars and others here. For bounded or unbounded typevars, I like the union type resulting from multiple ORed constraints! I don't think that should go to `Unknown`. But I do think that if we can't pick one of the constrained types for a constrained typevar, that should go to `Unknown`.

@dcreager (Member Author) commented Nov 19, 2025:
I think the unbounded example was incorrect, though. The constraint set is `T ≤ int ∨ T ≤ str`, but `int | str` is not a subtype of either `int` or `str`, so `T = int | str` is not a valid specialization. With the change that I'm introducing to handle constrained typevars, I now get `T = Never` as the specialization, which I think is the only correct one.
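A quick editorial sketch of that subtyping claim, assuming `ty_extensions.is_subtype_of` (which does not appear elsewhere in this thread) alongside the `static_assert` already used above:

```py
from ty_extensions import is_subtype_of, static_assert

# int | str satisfies neither branch of (T ≤ int) ∨ (T ≤ str) on its own.
static_assert(not is_subtype_of(int | str, int))
static_assert(not is_subtype_of(int | str, str))
```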

@dcreager (Member Author):
(This is the unbounded example:)

```py
# revealed: ty_extensions.Specialization[T@unbounded = int | str]
reveal_type(generic_context(unbounded).specialize_constrained(ConstraintSet.range(Never, T, int) | ConstraintSet.range(Never, T, str)))
```

@carljm (Contributor):
> I think the unbounded example was incorrect, though. The constraint set is `T ≤ int ∨ T ≤ str`, but `int | str` is not a subtype of either `int` or `str`, so `T = int | str` is not a valid specialization.

Oh yes, of course this is right.

> With the change that I'm introducing to handle constrained typevars, I now get `T = Never` as the specialization, which I think is the only correct one.

Hmm, well it's clearly not the only correct solution, and definitely not the "closest to object" one? Aren't `T = int` and `T = str` both valid solutions for the constraint set `T ≤ int ∨ T ≤ str`? (It's an OR, not an AND.) `Never` is a valid solution but doesn't seem like a useful one.

But now I see why it might make sense to specialize to `Unknown` in this case, too. We have two equally-good specialization options and no reason to pick one over the other, and either one we pick could lead to false positives.

@dcreager (Member Author):
> But now I see why it might make sense to specialize to `Unknown` in this case, too. We have two equally-good specialization options and no reason to pick one over the other, and either one we pick could lead to false positives.

This maybe brings up a different question, which is the difference between a useful solution and a valid one. As I've written it, we do return a specialization for the unbounded example — `T = Never`. (It's the only single specialization that satisfies all branches of the constraint set union.) That's different from the constrained typevar case, where we return `None`. So we wouldn't fall back on `T = Unknown` for the unbounded case.

This has come up in some of the other PRs, as well. Is `T = Never` ever a specialization we'd want to return? I've been arguing that if you don't want that as a specialization, you should be adding `T ≠ Never` as an additional constraint in your constraint set, and that the constraint set algorithms shouldn't presuppose that.

@carljm (Contributor):
I'm honestly not sure on these questions without tying it back to real Python examples. It's not clear to me in what real-world scenarios we'd end up with a constraint set like `T ≤ int ∨ T ≤ str`. So maybe it's better to leave this for a future PR with ecosystem results to look at.

Comment on lines +153 to +154:

> If any of the constraints is a gradual type, we are free to choose any materialization of that constraint that makes the test succeed.
@carljm (Contributor):
Yes, but shouldn't we still solve to the gradual type that is one of the constraints? E.g. see https://pyright-play.net/?pyrightVersion=1.1.405&pythonVersion=3.12&reportUnreachable=true&code=GYJw9gtgBALgngBwJYDsDmUkQWEMoCCKcAUCQCYCmwUwA2gCoBcUAFKjADSHECUAuqwAeLBlF5QAtAD4ozElEVQQlGAFcQKKELIqAbpQCGAGwD68BJVbBWAImBgwt3rzJA

E.g. in every solution in the first section below that is not Base, I would expect to solve to Any, rather than any of the materializations of it. I suspect this will cause false positives and violate the gradual guarantee otherwise?

It seems like the current approach takes as a requirement that we always solve to a fully static type, but I'm not sure that is what we should do, in the case of constrained type variables.

@dcreager (Member Author):
> It seems like the current approach takes as a requirement that we always solve to a fully static type, but I'm not sure that is what we should do, in the case of constrained type variables.

Hmm, that is a good point. I would like to put in TODOs and handle that as follow-on work.

Comment on lines +264 to +272:

```py
# revealed: ty_extensions.Specialization[T@mutually_bound = Base, U@mutually_bound = Base]
reveal_type(generic_context(mutually_bound).specialize_constrained(ConstraintSet.range(Never, U, T)))

# revealed: ty_extensions.Specialization[T@mutually_bound = Sub, U@mutually_bound = object]
reveal_type(generic_context(mutually_bound).specialize_constrained(ConstraintSet.range(Never, T, Sub)))
# revealed: ty_extensions.Specialization[T@mutually_bound = Sub, U@mutually_bound = Sub]
reveal_type(generic_context(mutually_bound).specialize_constrained(ConstraintSet.range(Never, T, Sub) & ConstraintSet.range(Never, U, T)))
# revealed: ty_extensions.Specialization[T@mutually_bound = Base, U@mutually_bound = Sub]
reveal_type(generic_context(mutually_bound).specialize_constrained(ConstraintSet.range(Never, U, Sub) & ConstraintSet.range(Never, U, T)))
```
@carljm (Contributor):
🎉

dcreager added a commit that referenced this pull request Nov 19, 2025
We're seeing flaky test failures on macOS, which seem to be caused by
different Salsa ID orderings on the different platforms. Constraint set
BDDs order their internal nodes based on the Salsa IDs of the interned
typevar structs, and we had some code that depended on variable ordering
in an unexpected way.

This patch definitely fixes the macOS test failure on #21414, and
hopefully fixes it on #21436, too.
* origin/main:
  [ty] Fix flaky tests on macos (#21524)
  [ty] Add tests for generic implicit type aliases (#21522)
  [ty] Semantic tokens: consistently add the `DEFINITION` modifier (#21521)
  Only render hyperlinks for terminals known to support them (#21519)
  [ty] Keep colorizing `mypy_primer` output (#21515)
  [ty] Exit with `2` if there's any IO error (#21508)
  [`ruff`] Fix false positive for complex conversion specifiers in `logging-eager-conversion` (`RUF065`) (#21464)
  [ty] tighten up handling of subscripts in type expressions (#21503)
@dcreager merged commit 9793551 into main on Nov 19, 2025
66 of 68 checks passed
@dcreager deleted the dcreager/specialize-from-constraints branch on November 19, 2025 19:20
dcreager added a commit that referenced this pull request Nov 19, 2025
…t sets (#21530)

#21414 added the ability to create a specialization from a constraint
set. It handled mutually constrained typevars just fine, e.g. given `T ≤
int ∧ U = T` we can infer `T = int, U = int`.

But it didn't handle _nested_ constraints correctly, e.g. `T ≤ int ∧ U =
list[T]`. Now we do! This requires doing a fixed-point "apply the
specialization to itself" step to propagate the assignments of any
nested typevars, and then a cycle detection check to make sure we don't
have an infinite expansion in the specialization.

This gets at an interesting nuance in our constraint set structure that
@sharkdp has asked about before. Constraint sets are BDDs, and each
internal node represents an _individual constraint_, of the form `lower
≤ T ≤ upper`. `lower` and `upper` are allowed to be other typevars, but
only if they appear "later" in the arbitary ordering that we establish
over typevars. The main purpose of this is to avoid infinite expansion
for mutually constrained typevars.

However, that restriction doesn't help us here, because it only applies
when `lower` and `upper` _are_ typevars, not when they _contain_
typevars. That distinction is important, since it means the restriction
does not affect our expressiveness: we can always rewrite `Never ≤ T ≤
U` (a constraint on `T`) into `T ≤ U ≤ object` (a constraint on `U`).
The same is not true of `Never ≤ T ≤ list[U]` — there is no "inverse" of
`list` that we could apply to both sides to transform this into a
constraint on a bare `U`.
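As an editorial toy sketch of that fixed-point step in Python (the real implementation is in Rust and substitutes structured types; the naive string replacement here is only for illustration):

```py
def apply_to_fixpoint(assignments):
    """assignments: {typevar: type-as-a-string}, e.g. {"T": "int", "U": "list[T]"}.
    Substitute the assignments into each other until nothing changes; a typevar
    that still occurs in its own assignment would expand forever (a cycle)."""
    for _ in range(len(assignments) + 1):
        changed = False
        for var in assignments:
            ty = assignments[var]
            for other, other_ty in assignments.items():
                if other != var:
                    # Naive substring substitution stands in for walking the
                    # structure of a real type.
                    new_ty = ty.replace(other, other_ty)
                    if new_ty != ty:
                        ty, changed = new_ty, True
            if var in ty and ty != var:
                raise ValueError(f"cycle: {var} occurs in its own assignment {ty!r}")
            assignments[var] = ty
        if not changed:
            return assignments
    return assignments

print(apply_to_fixpoint({"T": "int", "U": "list[T]"}))  # {'T': 'int', 'U': 'list[int]'}
apply_to_fixpoint({"T": "list[T]"})                     # raises: infinite expansion
```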