
[Stream] Update executable functions in encoding specialization pass. #19700

Merged: 3 commits into iree-org:main on Feb 3, 2025

Conversation

hanhanW (Contributor) commented Jan 14, 2025

For the stream bindings, the encoding types of the duplicated executables need to be updated with the resolved layouts. This revision does it through a type interface: it introduces EncodingTypeInterface in the IREE::Encoding dialect, with two interface methods:

  • getEncodingType: returns the tensor type that carries the encoding. E.g., the Flow::DispatchTensorType implementation returns the bounded type.
  • updateEncoding: returns the same type but with the new encoding. E.g., the Flow::DispatchTensorType implementation returns the type whose bounded type carries the new encoding.

The revision implements these interface methods for Flow::DispatchTensorType and uses them to update the bindings; a rough sketch of the interface shape follows.
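
For illustration only, here is a minimal C++ sketch of the shape these two methods could take, using a hypothetical stand-in for a dispatch-tensor-like type. The type name, method signatures, and header choices are assumptions for the sketch, not the actual interface added in this PR.

```
// Sketch only: a stand-in for a dispatch-tensor-like type that wraps a
// "bounded" tensor type carrying the encoding attribute. Names and
// signatures are assumed for illustration; the real interface lives in the
// IREE::Encoding dialect.
#include "mlir/IR/Attributes.h"
#include "mlir/IR/BuiltinTypes.h"

namespace sketch {

struct DispatchTensorLikeType {
  mlir::RankedTensorType boundType;  // e.g. tensor<?x?xf32, #encoding>

  // getEncodingType: return the tensor type that carries the encoding.
  // For a dispatch tensor, that is the bounded tensor type.
  mlir::RankedTensorType getEncodingType() const { return boundType; }

  // updateEncoding: return the same wrapper type, but with the bounded
  // tensor's encoding replaced by the new (resolved) encoding attribute.
  DispatchTensorLikeType updateEncoding(mlir::Attribute newEncoding) const {
    auto updated = mlir::RankedTensorType::get(
        boundType.getShape(), boundType.getElementType(), newEncoding);
    return DispatchTensorLikeType{updated};
  }
};

}  // namespace sketch
```

With this shape, a specialization pass can read a binding's tensor type via getEncodingType, compute the resolved encoding (e.g., with layouts attached), and write it back with updateEncoding.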

Codegen already looks at flow.dispatch.tensor.load/store, so it will have the incoming layout information available. If a layout transfer is needed, codegen should be able to generate relayout ops when it materializes the encodings: the incoming layout is attached to the binding and codegen knows the target layout, so it can generate relayout ops that bring the incoming layout to the target layout. For example (a toy sketch of the relayout check follows this IR):

```
#encoding3 = ...operand_index = 0 : index, op_type = matmul, element_types = [f32, f32, f32],
  layouts = [#iree_cpu.cpu_encoding_layout<configuration = {
    encoding_info = {innerDimsPos = [0, 1],
                     innerTileSizes = [16, 1],
                     outerDimsPerm = [0, 1]}}>]
>
#encoding6 = ...operand_index = 0 : index, op_type = matmul, element_types = [f32, f32, f32],
  user_indexing_maps = [#map, #map1, #map2]
>
%7 = flow.dispatch.tensor.load %4,
  offsets = [0, 0], sizes = [%2, %0], strides = [1, 1]
  : !flow.dispatch.tensor<readonly:tensor<?x?xf32, #encoding3>>{%2, %0}
  -> tensor<?x?xf32, #encoding6>
```
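
Below is a toy C++ sketch of the relayout decision described above. The struct and helper are hypothetical; they only mirror the encoding_info fields shown in the example and are not an actual IREE data structure.

```
// Toy model of the resolved layout carried by a binding-side encoding
// (mirroring the encoding_info fields above: innerDimsPos, innerTileSizes,
// outerDimsPerm). Hypothetical; not an IREE API.
#include <cstdint>
#include <vector>

struct EncodingInfo {
  std::vector<int64_t> innerDimsPos;
  std::vector<int64_t> innerTileSizes;
  std::vector<int64_t> outerDimsPerm;

  bool operator==(const EncodingInfo &other) const {
    return innerDimsPos == other.innerDimsPos &&
           innerTileSizes == other.innerTileSizes &&
           outerDimsPerm == other.outerDimsPerm;
  }
};

// Codegen sees the incoming layout attached to the binding (#encoding3's
// `layouts`) and knows the layout it wants for the target. If the two
// differ, it must generate relayout ops to bring the incoming layout to
// the target layout.
inline bool needsRelayout(const EncodingInfo &incoming,
                          const EncodingInfo &target) {
  return !(incoming == target);
}
```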

hanhanW (Contributor, Author) commented Jan 14, 2025

This depends on #19527 and is ready for review.

@hanhanW force-pushed the specialize-encodings-5-n branch from a1e7d8f to bc4cbc0 on January 14, 2025 18:30
hanhanW (Contributor, Author) commented Jan 15, 2025

cc @bjacob @Max191, heads-up. In the future, I expect to see something like the example below. In my encoding specialization pass, we resolve layouts and attach them to the bindings. I think this is where the layout transfer happens: if the requested layout is different (i.e., the load shape differs from the source shape), we'll need to generate relayout ops in codegen.

```
  operand_index = 0 : index, op_type = matmul, element_types = [f32, f32, f32],
  layouts = [#iree_cpu.cpu_encoding_layout<configuration = {
    encoding_info = {innerDimsPos = [0, 1],
                     innerTileSizes = [16, 1],
                     outerDimsPerm = [0, 1]}}>]
>
  operand_index = 0 : index, op_type = matmul, element_types = [f32, f32, f32],
  user_indexing_maps = [#map, #map1, #map2]
>
%7 = flow.dispatch.tensor.load %4,
  offsets = [0, 0], sizes = [%2, %0], strides = [1, 1]
  : !flow.dispatch.tensor<readonly:tensor<?x?xf32, #encoding3>>{%2, %0}
  -> tensor<?x?xf32, #encoding6>
```

@hanhanW force-pushed the users/hanhanW/specialize-encodings-2-n branch from c2756a0 to d80e377 on January 15, 2025 08:28
@hanhanW force-pushed the specialize-encodings-5-n branch from bc4cbc0 to 098e173 on January 15, 2025 08:28
@hanhanW marked this pull request as draft on January 24, 2025 13:45
@hanhanW changed the base branch from users/hanhanW/specialize-encodings-2-n to users/hanhanW/specialize-encodings-dup-2 on January 24, 2025 15:03
@hanhanW force-pushed the specialize-encodings-5-n branch from c5e737b to fa69c58 on January 24, 2025 15:04
@hanhanW force-pushed the specialize-encodings-5-n branch 3 times, most recently from 9c827e1 to ec2be7c, on February 3, 2025 06:50
@hanhanW changed the base branch from users/hanhanW/specialize-encodings-dup-2 to main on February 3, 2025 06:51
@hanhanW force-pushed the specialize-encodings-5-n branch from ec2be7c to 2c25a19 on February 3, 2025 07:01
@hanhanW marked this pull request as ready for review on February 3, 2025 07:03
hanhanW (Contributor, Author) commented Feb 3, 2025

@benvanik this is ready for review. In the meantime, I'm working on the testing encoding attribute, so we do not need to rely on VMVX/CPU ones.

@hanhanW requested a review from kuhar on February 3, 2025 09:12
hanhanW (Contributor, Author) commented Feb 3, 2025

> @benvanik this is ready for review. In the meantime, I'm working on the testing encoding attribute, so we do not need to rely on VMVX/CPU ones.

#19879 is up for review. I'll rebase and update the tests once it's landed.

benvanik (Collaborator) left a comment:

looks great, and your new test encoding attrs will clean up the tests nicely!

hanhanW (Contributor, Author) commented Feb 3, 2025

> looks great, and your new test encoding attrs will clean up the tests nicely!

Thank you, I'm going to merge the cleanup and update the tests. Thanks for the quick review!

hanhanW (Contributor, Author) commented Feb 3, 2025

> looks great, and your new test encoding attrs will clean up the tests nicely!

I really like the test encoding attrs. It looks beautiful to me, see 387057d :)

@hanhanW merged commit 78f312b into iree-org:main on Feb 3, 2025; 42 checks passed
@hanhanW deleted the specialize-encodings-5-n branch on February 3, 2025 17:56
ita9naiwa pushed a commit to ita9naiwa/iree that referenced this pull request Feb 4, 2025