Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support specialization in impl lookup with a symbolic query/impl. #5169

Merged
merged 16 commits into from
Mar 26, 2025

Conversation

danakj
Copy link
Contributor

@danakj danakj commented Mar 22, 2025

Add a new instruction called ImplSymbolicWitness which represents a search for an impl declaration given a self type and an interface to find implemented for the self type. The self type is stored as a constant instruction id, rather than as a ConstantId, as instructions don't currently support holding ConstantId. The interface is stored as a SpecificInterface but we can't fit all of it directly into the instruction. So we add a new id to refer to the SpecificInterface as follows.

Add a new SpecificInterfaceId which indexes into a canonical value store on SemIR::File. This tracks all SpecificInterfaces stored in an instruction - specifically the ImplSymbolicWitness instruction.

The SpecificInterface on Impl is still stored there as a value, not as an id, and no id is eagerly constructed for it. We wait until an id is needed to make one. Since they are canonical, a new id is only create when a new SpecificInterface value is seen.

When doing impl lookup, and the query is not concrete, and the impl is not effectively final, the query needs to consider future impls that may specialize either the self type or the constaint to make a more precise match and replace the found impl declaration. Instead of returning the ImplWitness instruction from the found impl, we generate a ImplSymbolicWitness instruction, storing the query so that it can be replayed later. This instruction is added to the generic eval block and thus will be re-evaluated later with a SpecificId that may make the query more concrete. When evaluating the instruction and replaying the query, the lookup has the same conditions and if it does not decide to use the found impl concretely, then the same instruction is returned from eval, leaving it as symbolic.

--- Impl lookup changes ---

Impl lookup gets a little more interesting now. It continues to look in the facet value for a witness if the self type is a facet value. Then falls back to looking for an impl declaration. This step is no longer done directly. Instead, we construct a ImplSymbolicWitness instruction and evaluate it immediately for each interface that are in the query facet type.

The ImplSymbolicWitness instruction, when evaluated, calls back to the impl lookup code, with a query specific interface. There we resume back into the same code path as from before, finding a witness in an impl declaration. But we may return "found a non-final impl" instead of a concrete witness. If eval receives this back, it evaluates to the current ImplSymbolicWitness instruction as the resulting constant value.

To pass lookup failures back through eval, a result of InstId::None from the second step of impl lookup will result in a non-constant value, which is used as a signal back up the stack to the original impl lookup function that the lookup failed. Using a non-constant value here would break evaluation of the generic eval block if impl lookup could fail there, however we know it will not since we only leave behind an ImplSymbolicWitness instruction in the eval block if we found at least one matching impl already, and we just want to look for a better match with a more specific query.

We must take care to not store a reference into any value store across computation in impl lookup, since impl lookup can recurse into itself invalidate those stores. That includes the SpecificInterface obtained from a SpecificInterfaceId, which impl lookup also inserts into the store.

--- The long tail ---

Adding a new instruction and a new id type requires a myriad of changes to support them:

We add Dump() support for SpecificInterfaceId. And fix a crash in Dump for SpecificId::None. We also add MakeSpecificInterfaceId() for dumping arbitrary ids.

The type of ImplSymbolicWitness is a new singleton builtin type instruction called WitnessSymbolicType (like WitnessType is the type for an ImplWitness).

Both ImplSymbolicWitness and WitnessSymbolicType are given Value as their expression category as they are builtin constant values. And BuildInfo() in TypeCompleter is taught about them both, returning a ValueRepr::Copy.

WitnessSymbolicType is added to the set of SingletonInstKinds, so that it can have a singleton instrution id as a static member.

Lower's BuildTypeForInst() is taught to make an empty struct for WitnessSymbolicType, similar to WitnessType.

Instruction formatter (FormatterImpl) grows support for printing a SpecificInterfaceId so that it can print both arguments of ImplSymbolicWitness on the RHS when printing the SemIR instruction. To print a SpecificInterfaceId, it prints both the interface id and the specific id (if there is one). For example, for a query on a generic interface Z with one parameter, the RHS includes the query, interface, and specific:

%Z.impl_symbolic_witness: <symbolic witness> = impl_symbolic_witness %U, @Z, @Z(%U.as_type) [symbolic]

IdKind is extended to include SpecificInterfaceId.

InstFingerprinter is taught to look through SpecificInterfaceId and use the interface and specific ids in the fingerprint.

InstNamer is taught about SpecificInterfaceId, counting the interfaces when building an index. It is also tought about ImplSymbolicWitness, using the name of the interface within and the .impl_symbolic_witness suffix. For example, here the LHS is named after the interface in the query:

%Z.impl_symbolic_witness: <symbolic witness> = impl_symbolic_witness %U, @Z, @Z(%U.as_type) [symbolic]

StringifyTypeExpr is taught about WitnessSymbolicType, which uses its IR name since it's a singleton. And about ImplSymbolicWitness which uses its constant value. The handling of ImplWitnessAccess also needed to be adjusted, since it assumed that ImplWitnessAccess::witness_id would always be a FacetAccessWitness, but it can now also be an ImplSymbolicWitness. (It seems that the witness_id is also assigned ImplWitness instructions, but those ImplWitnessAccess instructions don't ever seem to get stringified in a diagnostic at this time.) At the moment the ImplWitnessAccess with a symbolic witness is just stringified as "", such as in:

x.carbon:1:2: error: cannot implicitly convert value of type `()` to `<symbolic>` [ConversionFailure]
  let a: C(D).(Z.X) = ();
                      ^~

There is a TODO left behind to include more information there.

The TypeStructure builder is made to handle WitnessSymbolicType and WitnessType. These come up now in deduce where a generic impl will have a ImplSymbolicWitness in a FacetValue for a generic self type. The query may have a concrete ImplWitness in the same position. Since deduce tries to deduce through the FacetValue, it tries to convert ImplWitness to ImplSymbolicWitness, tries to do an impl lookup for impl ImplWitness as ImplicitAs(ImplSymbolicWitness) and causes us to build type structures with each of these.

Subst is updated to handle pushing and popping SpecificInterfaceId. Without this, when finishing a generic's eval block, we would walk into the ImplSymbolicWitness instruction, and its arguments, and fail to recurse down into the SpecificInterfaceId. Then any specifics inside would be left as "orphaned" without any generic id attached to them, and we would never update the instructions in the SpecificInterface's instructions (inside its own SpecificId) with new constant values when evaluating the generic eval block against a specific. To do this we push the specific_id inside the SpecificInterface, and when popping we pop the specific_id then construct a new canonical SpecificInterface with it and return that id.

We add support for importing ImplSymbolicWitness by importing its self constant instruction and specific interface id. However we also had to add import support for SpecificImplFunction, which can now appear in the generic eval block for a generic impl declaration, and thus must be imported with the declaration. This is done very similarly to SpecificFunction, except the type_id is a singleton value.

danakj added 2 commits March 22, 2025 18:49
Add a new instruction called ImplSymbolicWitness which represents a
search for an impl declaration given a self type and an interface to
find implemented for the self type. The self type is stored as a
constant instruction id, rather than as a ConstantId, as instructions
don't currently support holding ConstantId. The interface is stored as
a SpecificInterface but we can't fit all of it directly into the
instruction. So we add a new id to refer to the SpecificInterface as
follows.

Add a new SpecificInterfaceId which indexes into a canonical value store
on SemIR::File. This tracks all `SpecificInterface`s stored in an
instruction - specifically the ImplSymbolicWitness instruction.

The SpecificInterface on Impl is still stored there as a value, not as
an id, and no id is eagerly constructed for it. We wait until an id is
needed to make one. Since they are canonical, a new id is only create
when a new SpecificInterface value is seen.

When doing impl lookup, and the query is not concrete, and the impl is
not effectively final, the query needs to consider future impls that may
specialize either the self type or the constaint to make a more precise
match and replace the found impl declaration. Instead of returning the
ImplWitness instruction from the found impl, we generate a
ImplSymbolicWitness instruction, storing the query so that it can be
replayed later. This instruction is added to the generic eval block and
thus will be re-evaluated later with a SpecificId that may make the
query more concrete. When evaluating the instruction and replaying the
query, the lookup has the same conditions and if it does not decide to
use the found impl concretely, then the same instruction is returned
from eval, leaving it as symbolic.

--- Impl lookup changes ---

Impl lookup gets a little more interesting now. It continues to look
in the facet value for a witness if the self type is a facet value. Then
falls back to looking for an impl declaration. This step is no longer
done directly. Instead, we construct a ImplSymbolicWitness instruction
and evaluate it immediately for each interface that are in the query
facet type.

The ImplSymbolicWitness instruction, when evaluated, calls back to the
impl lookup code, with a query specific interface. There we resume back
into the same code path as from before, finding a witness in an impl
declaration. But we may return "found a non-final impl" instead of
a concrete witness. If eval receives this back, it evaluates to the
current ImplSymbolicWitness instruction as the resulting constant value.

To pass lookup failures back through eval, a result of InstId::None from
the second step of impl lookup will result in a non-constant value,
which is used as a signal back up the stack to the original impl lookup
function that the lookup failed. Using a non-constant value here would
break evaluation of the generic eval block if impl lookup could fail
there, however we know it will not since we only leave behind an
ImplSymbolicWitness instruction in the eval block if we found at least
one matching impl already, and we just want to look for a better match
with a more specific query.

We must take care to not store a reference into any value store across
computation in impl lookup, since impl lookup can recurse into itself
invalidate those stores. That includes the SpecificInterface obtained
from a SpecificInterfaceId, which impl lookup also inserts into the
store.

--- The long tail ---

Adding a new instruction and a new id type requires a myriad of changes
to support them:

We add Dump() support for SpecificInterfaceId. And fix a crash in Dump
for SpecificId::None. We also add MakeSpecificInterfaceId() for dumping
arbitrary ids.

The type of ImplSymbolicWitness is a new singleton builtin type
instruction called WitnessSymbolicType (like WitnessType is the type
for an ImplWitness).

Both ImplSymbolicWitness and WitnessSymbolicType are given `Value` as
their expression category as they are builtin constant values. And
BuildInfo() in TypeCompleter is taught about them both,
returning a `ValueRepr::Copy`.

WitnessSymbolicType is added to the set of SingletonInstKinds, so that
it can have a singleton instrution id as a static member.

Lower's BuildTypeForInst() is taught to make an empty struct for
WitnessSymbolicType, similar to WitnessType.

Instruction formatter (FormatterImpl) grows support for printing
a SpecificInterfaceId so that it can print both arguments of
ImplSymbolicWitness on the RHS when printing the SemIR instruction.
To print a SpecificInterfaceId, it prints both the interface id and the
specific id (if there is one). For example, for a query on a generic
interface `Z` with one parameter, the RHS includes the query, interface,
and specific:
```
%Z.impl_symbolic_witness: <symbolic witness> = impl_symbolic_witness %U, @z, @z(%U.as_type) [symbolic]
```

IdKind is extended to include SpecificInterfaceId.

InstFingerprinter is taught to look through SpecificInterfaceId and use
the interface and specific ids in the fingerprint.

InstNamer is taught about SpecificInterfaceId, counting the interfaces
when building an index. It is also tought about ImplSymbolicWitness,
using the name of the interface within and the `.impl_symbolic_witness`
suffix. For example, here the LHS is named after the interface in the
query:
```
%Z.impl_symbolic_witness: <symbolic witness> = impl_symbolic_witness %U, @z, @z(%U.as_type) [symbolic]
```

StringifyTypeExpr is taught about WitnessSymbolicType, which uses its IR
name since it's a singleton. And about ImplSymbolicWitness which uses
its constant value. The handling of ImplWitnessAccess also needed to be
adjusted, since it assumed that ImplWitnessAccess::witness_id would
always be a FacetAccessWitness, but it can now also be an
ImplSymbolicWitness. (It seems that the witness_id is also assigned
ImplWitness instructions, but those ImplWitnessAccess instructions don't
ever seem to get stringified in a diagnostic at this time.) At the
moment the ImplWitnessAccess with a symbolic witness is just stringified
as "<symbolic>", such as in:
```
x.carbon:1:2: error: cannot implicitly convert value of type `()` to `<symbolic>` [ConversionFailure]
  let a: C(D).(Z.X) = ();
                      ^~
```

There is a TODO left behind to include more information there.

The TypeStructure builder is made to handle WitnessSymbolicType and
WitnessType. These come up now in deduce where a generic impl will have
a ImplSymbolicWitness in a FacetValue for a generic self type. The query
may have a concrete ImplWitness in the same position. Since deduce tries
to deduce through the FacetValue, it tries to convert ImplWitness to
ImplSymbolicWitness, tries to do an impl lookup for
`impl ImplWitness as ImplicitAs(ImplSymbolicWitness)` and causes us to
build type structures with each of these.

Subst is updated to handle pushing and popping SpecificInterfaceId.
Without this, when finishing a generic's eval block, we would walk into
the ImplSymbolicWitness instruction, and its arguments, and fail to
recurse down into the SpecificInterfaceId. Then any specifics inside\
would be left as "orphaned" without any generic id attached to them, and
we would never update the instructions in the SpecificInterface's
instructions (inside its own SpecificId) with new constant values
when evaluating the generic eval block against a specific. To do this
we push the specific_id inside the SpecificInterface, and when popping
we pop the specific_id then construct a new canonical SpecificInterface
with it and return that id.

We add support for importing ImplSymbolicWitness by importing its self
constant instruction and specific interface id. However we also had to
add import support for SpecificImplFunction, which can now appear in
the generic eval block for a generic impl declaration, and thus must
be imported with the declaration. This is done very similarly to
SpecificFunction, except the `type_id` is a singleton value.
@github-actions github-actions bot requested a review from jonmeow March 22, 2025 22:53
@danakj
Copy link
Contributor Author

danakj commented Mar 22, 2025

Hm, deduce_nested_generic_class test stopped working for some reason. Will look next week.

@danakj
Copy link
Contributor Author

danakj commented Mar 22, 2025

Hm, deduce_nested_generic_class test stopped working for some reason. Will look next week.

Maybe ImplWitness needs to convert to ImplSymbolicWitness, but that sounds weird, or they need to match somehow in deduce.

Copy link
Contributor

@jonmeow jonmeow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally looks good. I'm not sure if you want someone more familiar with the generics half, but this looks like the discussion I was listening in on.

@jonmeow
Copy link
Contributor

jonmeow commented Mar 24, 2025

To note, the PR description is pretty lengthy -- on a skim it looks consistent, but I may be missing details.

@danakj
Copy link
Contributor Author

danakj commented Mar 24, 2025

Thanks! All great feedbacks. I am going to try understand/get the test to pass and will then address it all.

danakj added a commit to danakj/carbon-lang that referenced this pull request Mar 25, 2025
As long as the types (FacetTypes) of the FacetValue match, any witnesses
that are present should be accepted for deduce.

If the parameter's witnesses are concrete, the argument's will also be
concrete and the exact same values. This is what we have relied on so
far.

If the parameter's witnesses are symbolic, the argument may have either
symbolic or concrete witnesses. If they are symbolic then they are
equal. If they are concrete then they are not equal (and will fail
deduction if compared. But the argument FacetValue with a concrete
witness brings more information than the parameter has, and can be
used in place of the parameter's FacetValue.

All FacetValues have _some_ witness for every interface in the facet's
type (FacetType), so there's no question of whether a witness is present
or not between the parameter and argument.

This ensures that the deduction test with a nested FacetValue in a
specific block continues to pass when generic witness resolution can
produce a FacetValue with a symbolic witness in carbon-language#5169.

See issue carbon-language#5169 for more details.
danakj added a commit to danakj/carbon-lang that referenced this pull request Mar 25, 2025
As long as the types (FacetTypes) of the FacetValue match, any witnesses
that are present should be accepted for deduce.

If the parameter's witnesses are concrete, the argument's will also be
concrete and the exact same values. This is what we have relied on so
far.

If the parameter's witnesses are symbolic, the argument may have either
symbolic or concrete witnesses. If they are symbolic then they are
equal. If they are concrete then they are not equal (and will fail
deduction if compared. But the argument FacetValue with a concrete
witness brings more information than the parameter has, and can be
used in place of the parameter's FacetValue.

All FacetValues have _some_ witness for every interface in the facet's
type (FacetType), so there's no question of whether a witness is present
or not between the parameter and argument.

This ensures that the deduction test with a nested FacetValue in a
specific block continues to pass when generic witness resolution can
produce a FacetValue with a symbolic witness in carbon-language#5169.

See issue carbon-language#5169 for more details.
@danakj
Copy link
Contributor Author

danakj commented Mar 25, 2025

Hm, deduce_nested_generic_class test stopped working for some reason. Will look next week.

Maybe ImplWitness needs to convert to ImplSymbolicWitness, but that sounds weird, or they need to match somehow in deduce.

This issue is recorded and discussed in #5178. A PR to resolve it is separated out in #5179, which this PR will now depend on.

danakj added a commit to danakj/carbon-lang that referenced this pull request Mar 25, 2025
As long as the types (FacetTypes) of the FacetValue match, any witnesses
that are present should be accepted for deduce.

If the parameter's witnesses are concrete, the argument's will also be
concrete and the exact same values. This is what we have relied on so
far.

If the parameter's witnesses are symbolic, the argument may have either
symbolic or concrete witnesses. If they are symbolic then they are
equal. If they are concrete then they are not equal (and will fail
deduction if compared. But the argument FacetValue with a concrete
witness brings more information than the parameter has, and can be
used in place of the parameter's FacetValue.

All FacetValues have _some_ witness for every interface in the facet's
type (FacetType), so there's no question of whether a witness is present
or not between the parameter and argument.

This ensures that the deduction test with a nested FacetValue in a
specific block continues to pass when generic witness resolution can
produce a FacetValue with a symbolic witness in carbon-language#5169.

See issue carbon-language#5169 for more details.
@danakj danakj force-pushed the symbolic-witness branch from 84c3894 to a315717 Compare March 25, 2025 18:15
danakj and others added 9 commits March 25, 2025 15:07
Co-authored-by: Jon Ross-Perkins <[email protected]>
Co-authored-by: Jon Ross-Perkins <[email protected]>
Co-authored-by: Jon Ross-Perkins <[email protected]>
Co-authored-by: Jon Ross-Perkins <[email protected]>
Co-authored-by: Jon Ross-Perkins <[email protected]>
Co-authored-by: Jon Ross-Perkins <[email protected]>
danakj added a commit to danakj/carbon-lang that referenced this pull request Mar 25, 2025
As long as the types (FacetTypes) of the FacetValue match, any witnesses
that are present should be accepted for deduce.

If the parameter's witnesses are concrete, the argument's will also be
concrete and the exact same values. This is what we have relied on so
far.

If the parameter's witnesses are symbolic, the argument may have either
symbolic or concrete witnesses. If they are symbolic then they are
equal. If they are concrete then they are not equal (and will fail
deduction if compared. But the argument FacetValue with a concrete
witness brings more information than the parameter has, and can be
used in place of the parameter's FacetValue.

All FacetValues have _some_ witness for every interface in the facet's
type (FacetType), so there's no question of whether a witness is present
or not between the parameter and argument.

This ensures that the deduction test with a nested FacetValue in a
specific block continues to pass when generic witness resolution can
produce a FacetValue with a symbolic witness in carbon-language#5169.

See issue carbon-language#5169 for more details.
@danakj danakj force-pushed the symbolic-witness branch from 77e4818 to bcce35c Compare March 25, 2025 19:07
@danakj danakj requested a review from jonmeow March 25, 2025 19:07
Copy link
Contributor Author

@danakj danakj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! PTAL

danakj added a commit to danakj/carbon-lang that referenced this pull request Mar 25, 2025
As long as the types (FacetTypes) of the FacetValue match, any witnesses
that are present should be accepted for deduce.

If the parameter's witnesses are concrete, the argument's will also be
concrete and the exact same values. This is what we have relied on so
far.

If the parameter's witnesses are symbolic, the argument may have either
symbolic or concrete witnesses. If they are symbolic then they are
equal. If they are concrete then they are not equal (and will fail
deduction if compared. But the argument FacetValue with a concrete
witness brings more information than the parameter has, and can be
used in place of the parameter's FacetValue.

All FacetValues have _some_ witness for every interface in the facet's
type (FacetType), so there's no question of whether a witness is present
or not between the parameter and argument.

This ensures that the deduction test with a nested FacetValue in a
specific block continues to pass when generic witness resolution can
produce a FacetValue with a symbolic witness in carbon-language#5169.

See issue carbon-language#5169 for more details.
@danakj danakj force-pushed the symbolic-witness branch from bcce35c to 4f6982d Compare March 25, 2025 19:19
@danakj
Copy link
Contributor Author

danakj commented Mar 25, 2025

I think I need to merge and autoupdate to make tests pass on HEAD, but I will avoid cluttering the review with that for now.

Copy link
Contributor

@jonmeow jonmeow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. I see there's still a failing test, but my comments are pretty small so feel free to merge after the last few things.

@danakj danakj force-pushed the symbolic-witness branch from 4f6982d to e987aa5 Compare March 25, 2025 22:07
danakj added 2 commits March 26, 2025 10:24
@danakj
Copy link
Contributor Author

danakj commented Mar 26, 2025

Thanks for the review!

@danakj danakj enabled auto-merge March 26, 2025 14:31
@danakj danakj added this pull request to the merge queue Mar 26, 2025
Merged via the queue into carbon-language:trunk with commit 53c98a8 Mar 26, 2025
8 checks passed
@danakj danakj deleted the symbolic-witness branch March 26, 2025 15:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants