Support specialization in impl lookup with a symbolic query/impl. #5169

danakj · 2025-03-22T22:52:43Z

Add a new instruction called ImplSymbolicWitness which represents a search for an impl declaration given a self type and an interface to find implemented for the self type. The self type is stored as a constant instruction id, rather than as a ConstantId, as instructions don't currently support holding ConstantId. The interface is stored as a SpecificInterface but we can't fit all of it directly into the instruction. So we add a new id to refer to the SpecificInterface as follows.

Add a new SpecificInterfaceId which indexes into a canonical value store on SemIR::File. This tracks all SpecificInterfaces stored in an instruction - specifically the ImplSymbolicWitness instruction.

The SpecificInterface on Impl is still stored there as a value, not as an id, and no id is eagerly constructed for it. We wait until an id is needed to make one. Since they are canonical, a new id is only create when a new SpecificInterface value is seen.

When doing impl lookup, and the query is not concrete, and the impl is not effectively final, the query needs to consider future impls that may specialize either the self type or the constaint to make a more precise match and replace the found impl declaration. Instead of returning the ImplWitness instruction from the found impl, we generate a ImplSymbolicWitness instruction, storing the query so that it can be replayed later. This instruction is added to the generic eval block and thus will be re-evaluated later with a SpecificId that may make the query more concrete. When evaluating the instruction and replaying the query, the lookup has the same conditions and if it does not decide to use the found impl concretely, then the same instruction is returned from eval, leaving it as symbolic.

--- Impl lookup changes ---

Impl lookup gets a little more interesting now. It continues to look in the facet value for a witness if the self type is a facet value. Then falls back to looking for an impl declaration. This step is no longer done directly. Instead, we construct a ImplSymbolicWitness instruction and evaluate it immediately for each interface that are in the query facet type.

The ImplSymbolicWitness instruction, when evaluated, calls back to the impl lookup code, with a query specific interface. There we resume back into the same code path as from before, finding a witness in an impl declaration. But we may return "found a non-final impl" instead of a concrete witness. If eval receives this back, it evaluates to the current ImplSymbolicWitness instruction as the resulting constant value.

To pass lookup failures back through eval, a result of InstId::None from the second step of impl lookup will result in a non-constant value, which is used as a signal back up the stack to the original impl lookup function that the lookup failed. Using a non-constant value here would break evaluation of the generic eval block if impl lookup could fail there, however we know it will not since we only leave behind an ImplSymbolicWitness instruction in the eval block if we found at least one matching impl already, and we just want to look for a better match with a more specific query.

We must take care to not store a reference into any value store across computation in impl lookup, since impl lookup can recurse into itself invalidate those stores. That includes the SpecificInterface obtained from a SpecificInterfaceId, which impl lookup also inserts into the store.

--- The long tail ---

Adding a new instruction and a new id type requires a myriad of changes to support them:

We add Dump() support for SpecificInterfaceId. And fix a crash in Dump for SpecificId::None. We also add MakeSpecificInterfaceId() for dumping arbitrary ids.

The type of ImplSymbolicWitness is a new singleton builtin type instruction called WitnessSymbolicType (like WitnessType is the type for an ImplWitness).

Both ImplSymbolicWitness and WitnessSymbolicType are given Value as their expression category as they are builtin constant values. And BuildInfo() in TypeCompleter is taught about them both, returning a ValueRepr::Copy.

WitnessSymbolicType is added to the set of SingletonInstKinds, so that it can have a singleton instrution id as a static member.

Lower's BuildTypeForInst() is taught to make an empty struct for WitnessSymbolicType, similar to WitnessType.

Instruction formatter (FormatterImpl) grows support for printing a SpecificInterfaceId so that it can print both arguments of ImplSymbolicWitness on the RHS when printing the SemIR instruction. To print a SpecificInterfaceId, it prints both the interface id and the specific id (if there is one). For example, for a query on a generic interface Z with one parameter, the RHS includes the query, interface, and specific:

%Z.impl_symbolic_witness: <symbolic witness> = impl_symbolic_witness %U, @Z, @Z(%U.as_type) [symbolic]

IdKind is extended to include SpecificInterfaceId.

InstFingerprinter is taught to look through SpecificInterfaceId and use the interface and specific ids in the fingerprint.

InstNamer is taught about SpecificInterfaceId, counting the interfaces when building an index. It is also tought about ImplSymbolicWitness, using the name of the interface within and the .impl_symbolic_witness suffix. For example, here the LHS is named after the interface in the query:

%Z.impl_symbolic_witness: <symbolic witness> = impl_symbolic_witness %U, @Z, @Z(%U.as_type) [symbolic]

StringifyTypeExpr is taught about WitnessSymbolicType, which uses its IR name since it's a singleton. And about ImplSymbolicWitness which uses its constant value. The handling of ImplWitnessAccess also needed to be adjusted, since it assumed that ImplWitnessAccess::witness_id would always be a FacetAccessWitness, but it can now also be an ImplSymbolicWitness. (It seems that the witness_id is also assigned ImplWitness instructions, but those ImplWitnessAccess instructions don't ever seem to get stringified in a diagnostic at this time.) At the moment the ImplWitnessAccess with a symbolic witness is just stringified as "", such as in:

x.carbon:1:2: error: cannot implicitly convert value of type `()` to `<symbolic>` [ConversionFailure]
  let a: C(D).(Z.X) = ();
                      ^~

There is a TODO left behind to include more information there.

The TypeStructure builder is made to handle WitnessSymbolicType and WitnessType. These come up now in deduce where a generic impl will have a ImplSymbolicWitness in a FacetValue for a generic self type. The query may have a concrete ImplWitness in the same position. Since deduce tries to deduce through the FacetValue, it tries to convert ImplWitness to ImplSymbolicWitness, tries to do an impl lookup for impl ImplWitness as ImplicitAs(ImplSymbolicWitness) and causes us to build type structures with each of these.

Subst is updated to handle pushing and popping SpecificInterfaceId. Without this, when finishing a generic's eval block, we would walk into the ImplSymbolicWitness instruction, and its arguments, and fail to recurse down into the SpecificInterfaceId. Then any specifics inside would be left as "orphaned" without any generic id attached to them, and we would never update the instructions in the SpecificInterface's instructions (inside its own SpecificId) with new constant values when evaluating the generic eval block against a specific. To do this we push the specific_id inside the SpecificInterface, and when popping we pop the specific_id then construct a new canonical SpecificInterface with it and return that id.

We add support for importing ImplSymbolicWitness by importing its self constant instruction and specific interface id. However we also had to add import support for SpecificImplFunction, which can now appear in the generic eval block for a generic impl declaration, and thus must be imported with the declaration. This is done very similarly to SpecificFunction, except the type_id is a singleton value.

@z

Add a new instruction called ImplSymbolicWitness which represents a search for an impl declaration given a self type and an interface to find implemented for the self type. The self type is stored as a constant instruction id, rather than as a ConstantId, as instructions don't currently support holding ConstantId. The interface is stored as a SpecificInterface but we can't fit all of it directly into the instruction. So we add a new id to refer to the SpecificInterface as follows. Add a new SpecificInterfaceId which indexes into a canonical value store on SemIR::File. This tracks all `SpecificInterface`s stored in an instruction - specifically the ImplSymbolicWitness instruction. The SpecificInterface on Impl is still stored there as a value, not as an id, and no id is eagerly constructed for it. We wait until an id is needed to make one. Since they are canonical, a new id is only create when a new SpecificInterface value is seen. When doing impl lookup, and the query is not concrete, and the impl is not effectively final, the query needs to consider future impls that may specialize either the self type or the constaint to make a more precise match and replace the found impl declaration. Instead of returning the ImplWitness instruction from the found impl, we generate a ImplSymbolicWitness instruction, storing the query so that it can be replayed later. This instruction is added to the generic eval block and thus will be re-evaluated later with a SpecificId that may make the query more concrete. When evaluating the instruction and replaying the query, the lookup has the same conditions and if it does not decide to use the found impl concretely, then the same instruction is returned from eval, leaving it as symbolic. --- Impl lookup changes --- Impl lookup gets a little more interesting now. It continues to look in the facet value for a witness if the self type is a facet value. Then falls back to looking for an impl declaration. This step is no longer done directly. Instead, we construct a ImplSymbolicWitness instruction and evaluate it immediately for each interface that are in the query facet type. The ImplSymbolicWitness instruction, when evaluated, calls back to the impl lookup code, with a query specific interface. There we resume back into the same code path as from before, finding a witness in an impl declaration. But we may return "found a non-final impl" instead of a concrete witness. If eval receives this back, it evaluates to the current ImplSymbolicWitness instruction as the resulting constant value. To pass lookup failures back through eval, a result of InstId::None from the second step of impl lookup will result in a non-constant value, which is used as a signal back up the stack to the original impl lookup function that the lookup failed. Using a non-constant value here would break evaluation of the generic eval block if impl lookup could fail there, however we know it will not since we only leave behind an ImplSymbolicWitness instruction in the eval block if we found at least one matching impl already, and we just want to look for a better match with a more specific query. We must take care to not store a reference into any value store across computation in impl lookup, since impl lookup can recurse into itself invalidate those stores. That includes the SpecificInterface obtained from a SpecificInterfaceId, which impl lookup also inserts into the store. --- The long tail --- Adding a new instruction and a new id type requires a myriad of changes to support them: We add Dump() support for SpecificInterfaceId. And fix a crash in Dump for SpecificId::None. We also add MakeSpecificInterfaceId() for dumping arbitrary ids. The type of ImplSymbolicWitness is a new singleton builtin type instruction called WitnessSymbolicType (like WitnessType is the type for an ImplWitness). Both ImplSymbolicWitness and WitnessSymbolicType are given `Value` as their expression category as they are builtin constant values. And BuildInfo() in TypeCompleter is taught about them both, returning a `ValueRepr::Copy`. WitnessSymbolicType is added to the set of SingletonInstKinds, so that it can have a singleton instrution id as a static member. Lower's BuildTypeForInst() is taught to make an empty struct for WitnessSymbolicType, similar to WitnessType. Instruction formatter (FormatterImpl) grows support for printing a SpecificInterfaceId so that it can print both arguments of ImplSymbolicWitness on the RHS when printing the SemIR instruction. To print a SpecificInterfaceId, it prints both the interface id and the specific id (if there is one). For example, for a query on a generic interface `Z` with one parameter, the RHS includes the query, interface, and specific: ``` %Z.impl_symbolic_witness: <symbolic witness> = impl_symbolic_witness %U, @z, @z(%U.as_type) [symbolic] ``` IdKind is extended to include SpecificInterfaceId. InstFingerprinter is taught to look through SpecificInterfaceId and use the interface and specific ids in the fingerprint. InstNamer is taught about SpecificInterfaceId, counting the interfaces when building an index. It is also tought about ImplSymbolicWitness, using the name of the interface within and the `.impl_symbolic_witness` suffix. For example, here the LHS is named after the interface in the query: ``` %Z.impl_symbolic_witness: <symbolic witness> = impl_symbolic_witness %U, @z, @z(%U.as_type) [symbolic] ``` StringifyTypeExpr is taught about WitnessSymbolicType, which uses its IR name since it's a singleton. And about ImplSymbolicWitness which uses its constant value. The handling of ImplWitnessAccess also needed to be adjusted, since it assumed that ImplWitnessAccess::witness_id would always be a FacetAccessWitness, but it can now also be an ImplSymbolicWitness. (It seems that the witness_id is also assigned ImplWitness instructions, but those ImplWitnessAccess instructions don't ever seem to get stringified in a diagnostic at this time.) At the moment the ImplWitnessAccess with a symbolic witness is just stringified as "<symbolic>", such as in: ``` x.carbon:1:2: error: cannot implicitly convert value of type `()` to `<symbolic>` [ConversionFailure] let a: C(D).(Z.X) = (); ^~ ``` There is a TODO left behind to include more information there. The TypeStructure builder is made to handle WitnessSymbolicType and WitnessType. These come up now in deduce where a generic impl will have a ImplSymbolicWitness in a FacetValue for a generic self type. The query may have a concrete ImplWitness in the same position. Since deduce tries to deduce through the FacetValue, it tries to convert ImplWitness to ImplSymbolicWitness, tries to do an impl lookup for `impl ImplWitness as ImplicitAs(ImplSymbolicWitness)` and causes us to build type structures with each of these. Subst is updated to handle pushing and popping SpecificInterfaceId. Without this, when finishing a generic's eval block, we would walk into the ImplSymbolicWitness instruction, and its arguments, and fail to recurse down into the SpecificInterfaceId. Then any specifics inside\ would be left as "orphaned" without any generic id attached to them, and we would never update the instructions in the SpecificInterface's instructions (inside its own SpecificId) with new constant values when evaluating the generic eval block against a specific. To do this we push the specific_id inside the SpecificInterface, and when popping we pop the specific_id then construct a new canonical SpecificInterface with it and return that id. We add support for importing ImplSymbolicWitness by importing its self constant instruction and specific interface id. However we also had to add import support for SpecificImplFunction, which can now appear in the generic eval block for a generic impl declaration, and thus must be imported with the declaration. This is done very similarly to SpecificFunction, except the `type_id` is a singleton value.

danakj · 2025-03-22T22:55:20Z

Hm, deduce_nested_generic_class test stopped working for some reason. Will look next week.

danakj · 2025-03-22T22:57:37Z

Hm, deduce_nested_generic_class test stopped working for some reason. Will look next week.

Maybe ImplWitness needs to convert to ImplSymbolicWitness, but that sounds weird, or they need to match somehow in deduce.

jonmeow

Generally looks good. I'm not sure if you want someone more familiar with the generics half, but this looks like the discussion I was listening in on.

toolchain/sem_ir/dump.cpp

toolchain/sem_ir/id_kind.h

toolchain/sem_ir/stringify_type.cpp

toolchain/sem_ir/typed_insts.h

toolchain/check/impl_lookup.cpp

jonmeow · 2025-03-24T17:02:57Z

To note, the PR description is pretty lengthy -- on a skim it looks consistent, but I may be missing details.

danakj · 2025-03-24T17:41:01Z

Thanks! All great feedbacks. I am going to try understand/get the test to pass and will then address it all.

As long as the types (FacetTypes) of the FacetValue match, any witnesses that are present should be accepted for deduce. If the parameter's witnesses are concrete, the argument's will also be concrete and the exact same values. This is what we have relied on so far. If the parameter's witnesses are symbolic, the argument may have either symbolic or concrete witnesses. If they are symbolic then they are equal. If they are concrete then they are not equal (and will fail deduction if compared. But the argument FacetValue with a concrete witness brings more information than the parameter has, and can be used in place of the parameter's FacetValue. All FacetValues have _some_ witness for every interface in the facet's type (FacetType), so there's no question of whether a witness is present or not between the parameter and argument. This ensures that the deduction test with a nested FacetValue in a specific block continues to pass when generic witness resolution can produce a FacetValue with a symbolic witness in carbon-language#5169. See issue carbon-language#5169 for more details.

danakj · 2025-03-25T18:08:16Z

Hm, deduce_nested_generic_class test stopped working for some reason. Will look next week.

Maybe ImplWitness needs to convert to ImplSymbolicWitness, but that sounds weird, or they need to match somehow in deduce.

This issue is recorded and discussed in #5178. A PR to resolve it is separated out in #5179, which this PR will now depend on.

As long as the types (FacetTypes) of the FacetValue match, any witnesses that are present should be accepted for deduce. If the parameter's witnesses are concrete, the argument's will also be concrete and the exact same values. This is what we have relied on so far. If the parameter's witnesses are symbolic, the argument may have either symbolic or concrete witnesses. If they are symbolic then they are equal. If they are concrete then they are not equal (and will fail deduction if compared. But the argument FacetValue with a concrete witness brings more information than the parameter has, and can be used in place of the parameter's FacetValue. All FacetValues have _some_ witness for every interface in the facet's type (FacetType), so there's no question of whether a witness is present or not between the parameter and argument. This ensures that the deduction test with a nested FacetValue in a specific block continues to pass when generic witness resolution can produce a FacetValue with a symbolic witness in carbon-language#5169. See issue carbon-language#5169 for more details.

toolchain/sem_ir/stringify_type.cpp

toolchain/sem_ir/typed_insts.h

Co-authored-by: Jon Ross-Perkins <[email protected]>

As long as the types (FacetTypes) of the FacetValue match, any witnesses that are present should be accepted for deduce. If the parameter's witnesses are concrete, the argument's will also be concrete and the exact same values. This is what we have relied on so far. If the parameter's witnesses are symbolic, the argument may have either symbolic or concrete witnesses. If they are symbolic then they are equal. If they are concrete then they are not equal (and will fail deduction if compared. But the argument FacetValue with a concrete witness brings more information than the parameter has, and can be used in place of the parameter's FacetValue. All FacetValues have _some_ witness for every interface in the facet's type (FacetType), so there's no question of whether a witness is present or not between the parameter and argument. This ensures that the deduction test with a nested FacetValue in a specific block continues to pass when generic witness resolution can produce a FacetValue with a symbolic witness in carbon-language#5169. See issue carbon-language#5169 for more details.

danakj

Thanks! PTAL

toolchain/sem_ir/dump.cpp

toolchain/sem_ir/id_kind.h

toolchain/sem_ir/stringify_type.cpp

toolchain/check/impl_lookup.cpp

toolchain/check/impl_lookup.h

toolchain/check/impl_lookup.cpp

As long as the types (FacetTypes) of the FacetValue match, any witnesses that are present should be accepted for deduce. If the parameter's witnesses are concrete, the argument's will also be concrete and the exact same values. This is what we have relied on so far. If the parameter's witnesses are symbolic, the argument may have either symbolic or concrete witnesses. If they are symbolic then they are equal. If they are concrete then they are not equal (and will fail deduction if compared. But the argument FacetValue with a concrete witness brings more information than the parameter has, and can be used in place of the parameter's FacetValue. All FacetValues have _some_ witness for every interface in the facet's type (FacetType), so there's no question of whether a witness is present or not between the parameter and argument. This ensures that the deduction test with a nested FacetValue in a specific block continues to pass when generic witness resolution can produce a FacetValue with a symbolic witness in carbon-language#5169. See issue carbon-language#5169 for more details.

danakj · 2025-03-25T20:06:49Z

I think I need to merge and autoupdate to make tests pass on HEAD, but I will avoid cluttering the review with that for now.

jonmeow

Looks good to me. I see there's still a failing test, but my comments are pretty small so feel free to merge after the last few things.

toolchain/check/impl_lookup.cpp

danakj · 2025-03-26T14:31:09Z

Thanks for the review!

danakj added 2 commits March 22, 2025 18:49

reorganize-impl-lookup

98b0f29

github-actions bot added the toolchain label Mar 22, 2025

github-actions bot requested a review from jonmeow March 22, 2025 22:53

jonmeow reviewed Mar 24, 2025

View reviewed changes

danakj mentioned this pull request Mar 25, 2025

Deduce through FacetValue should not consider witnesses #5178

Closed

danakj mentioned this pull request Mar 25, 2025

Don't compare witnesses when deducing through FacetValue #5179

Closed

cleaner-result

1f12926

danakj force-pushed the symbolic-witness branch from 84c3894 to a315717 Compare March 25, 2025 18:15

CarbonInfraBot reviewed Mar 25, 2025

View reviewed changes

toolchain/sem_ir/stringify_type.cpp Outdated Show resolved Hide resolved

CarbonInfraBot reviewed Mar 25, 2025

View reviewed changes

toolchain/sem_ir/typed_insts.h Outdated Show resolved Hide resolved

danakj and others added 9 commits March 25, 2025 15:07

no-trygetas

63a82a5

Co-authored-by: Jon Ross-Perkins <[email protected]>

change-tense

272319c

Co-authored-by: Jon Ross-Perkins <[email protected]>

llvm-stable-sort

57300f0

Co-authored-by: Jon Ross-Perkins <[email protected]>

name-the-lookup-query

26d7ca8

Co-authored-by: Jon Ross-Perkins <[email protected]>

semir-loc-todo-reword

3dc1d00

Co-authored-by: Jon Ross-Perkins <[email protected]>

SemIR

84d4bda

Co-authored-by: Jon Ross-Perkins <[email protected]>

format

86c7b37

all-factory-methods-no-implicit-ctor

e5ffe75

review

0f19a42

danakj force-pushed the symbolic-witness branch from 77e4818 to bcce35c Compare March 25, 2025 19:07

danakj requested a review from jonmeow March 25, 2025 19:07

danakj commented Mar 25, 2025

View reviewed changes

test-a-deduced-call-with-symbolic-witness-argument

b9d26eb

danakj force-pushed the symbolic-witness branch from bcce35c to 4f6982d Compare March 25, 2025 19:19

jonmeow approved these changes Mar 25, 2025

View reviewed changes

toolchain/check/impl_lookup.cpp Show resolved Hide resolved

toolchain/check/impl_lookup.cpp Outdated Show resolved Hide resolved

toolchain/check/impl_lookup.cpp Show resolved Hide resolved

remove witness-symbolic-type

Loading
Loading status checks…

e987aa5

danakj force-pushed the symbolic-witness branch from 4f6982d to e987aa5 Compare March 25, 2025 22:07

danakj added 2 commits March 26, 2025 10:24

review

Loading
Loading status checks…

9a607c6

Merge branch 'trunk' into symbolic-witness

Loading
Loading status checks…

69ec4f7

danakj enabled auto-merge March 26, 2025 14:31

danakj added this pull request to the merge queue Mar 26, 2025

Merged via the queue into carbon-language:trunk with commit 53c98a8 Mar 26, 2025
8 checks passed

danakj deleted the symbolic-witness branch March 26, 2025 15:49

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Support specialization in impl lookup with a symbolic query/impl. #5169

Support specialization in impl lookup with a symbolic query/impl. #5169

danakj commented Mar 22, 2025

danakj commented Mar 22, 2025

danakj commented Mar 22, 2025

jonmeow left a comment

jonmeow commented Mar 24, 2025

danakj commented Mar 24, 2025

danakj commented Mar 25, 2025

danakj left a comment

danakj commented Mar 25, 2025

jonmeow left a comment

danakj commented Mar 26, 2025

Support specialization in impl lookup with a symbolic query/impl. #5169

Support specialization in impl lookup with a symbolic query/impl. #5169

Conversation

danakj commented Mar 22, 2025

danakj commented Mar 22, 2025

danakj commented Mar 22, 2025

jonmeow left a comment

Choose a reason for hiding this comment

jonmeow commented Mar 24, 2025

danakj commented Mar 24, 2025

danakj commented Mar 25, 2025

danakj left a comment

Choose a reason for hiding this comment

danakj commented Mar 25, 2025

jonmeow left a comment

Choose a reason for hiding this comment

danakj commented Mar 26, 2025