
[Stream] Update executable functions in encoding specialization pass. #19700

Merged: 3 commits into iree-org:main on Feb 3, 2025

Conversation

hanhanW (Contributor) commented Jan 14, 2025

For the stream bindings, the encoding types of the duplicated executables need to be updated with the resolved layouts. This revision does it through a type interface: it introduces EncodingTypeInterface in the IREE::Encoding dialect, with two interface methods:

  • getEncodingType: returns the tensor type that carries the encoding. E.g., the Flow::DispatchTensorType implementation returns the bounded type.
  • updateEncoding: returns the same type but with the new encoding. E.g., the Flow::DispatchTensorType implementation returns the type whose bounded type carries the new encoding.

The revision implements these interface methods for Flow::DispatchTensorType and uses them to update the bindings; a rough sketch of the interface shape follows.
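
For illustration only, here is a minimal C++ sketch of the shape these two methods could take, using a hypothetical stand-in for a dispatch-tensor-like type. The type name, method signatures, and header choices are assumptions for the sketch, not the actual interface added in this PR.

```
// Sketch only: a stand-in for a dispatch-tensor-like type that wraps a
// "bounded" tensor type carrying the encoding attribute. Names and
// signatures are assumed for illustration; the real interface lives in the
// IREE::Encoding dialect.
#include "mlir/IR/Attributes.h"
#include "mlir/IR/BuiltinTypes.h"

namespace sketch {

struct DispatchTensorLikeType {
  mlir::RankedTensorType boundType;  // e.g. tensor<?x?xf32, #encoding>

  // getEncodingType: return the tensor type that carries the encoding.
  // For a dispatch tensor, that is the bounded tensor type.
  mlir::RankedTensorType getEncodingType() const { return boundType; }

  // updateEncoding: return the same wrapper type, but with the bounded
  // tensor's encoding replaced by the new (resolved) encoding attribute.
  DispatchTensorLikeType updateEncoding(mlir::Attribute newEncoding) const {
    auto updated = mlir::RankedTensorType::get(
        boundType.getShape(), boundType.getElementType(), newEncoding);
    return DispatchTensorLikeType{updated};
  }
};

}  // namespace sketch
```

With this shape, a specialization pass can read a binding's tensor type via getEncodingType, compute the resolved encoding (e.g., with layouts attached), and write it back with updateEncoding.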

Codegen already looks at flow.dispatch.tensor.load/store, so it will have the incoming layout information available. If a layout transfer is needed, codegen should be able to generate relayout ops when it materializes the encodings: the incoming layout is attached to the binding and codegen knows the target layout, so it can generate relayout ops that bring the incoming layout to the target layout. For example (a toy sketch of the relayout check follows this IR):

```
#encoding3 = ...operand_index = 0 : index, op_type = matmul, element_types = [f32, f32, f32],
  layouts = [#iree_cpu.cpu_encoding_layout<configuration = {
    encoding_info = {innerDimsPos = [0, 1],
                     innerTileSizes = [16, 1],
                     outerDimsPerm = [0, 1]}}>]
>
#encoding6 = ...operand_index = 0 : index, op_type = matmul, element_types = [f32, f32, f32],
  user_indexing_maps = [#map, #map1, #map2]
>
%7 = flow.dispatch.tensor.load %4,
  offsets = [0, 0], sizes = [%2, %0], strides = [1, 1]
  : !flow.dispatch.tensor<readonly:tensor<?x?xf32, #encoding3>>{%2, %0}
  -> tensor<?x?xf32, #encoding6>
```
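
Below is a toy C++ sketch of the relayout decision described above. The struct and helper are hypothetical; they only mirror the encoding_info fields shown in the example and are not an actual IREE data structure.

```
// Toy model of the resolved layout carried by a binding-side encoding
// (mirroring the encoding_info fields above: innerDimsPos, innerTileSizes,
// outerDimsPerm). Hypothetical; not an IREE API.
#include <cstdint>
#include <vector>

struct EncodingInfo {
  std::vector<int64_t> innerDimsPos;
  std::vector<int64_t> innerTileSizes;
  std::vector<int64_t> outerDimsPerm;

  bool operator==(const EncodingInfo &other) const {
    return innerDimsPos == other.innerDimsPos &&
           innerTileSizes == other.innerTileSizes &&
           outerDimsPerm == other.outerDimsPerm;
  }
};

// Codegen sees the incoming layout attached to the binding (#encoding3's
// `layouts`) and knows the layout it wants for the target. If the two
// differ, it must generate relayout ops to bring the incoming layout to
// the target layout.
inline bool needsRelayout(const EncodingInfo &incoming,
                          const EncodingInfo &target) {
  return !(incoming == target);
}
```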

hanhanW (Contributor, Author) commented Jan 14, 2025

This depends on #19527 and is ready for review.

@hanhanW force-pushed the specialize-encodings-5-n branch from a1e7d8f to bc4cbc0 on January 14, 2025 18:30
hanhanW (Contributor, Author) commented Jan 15, 2025

cc @bjacob @Max191, heads-up. In the future, I expect to see something like the example below. In my encoding specialization pass, we resolve layouts and attach them to the bindings. I think this is where the layout transfer happens: if the requested layout is different (i.e., the load shape differs from the source shape), we'll need to generate relayout ops in codegen.

```
  operand_index = 0 : index, op_type = matmul, element_types = [f32, f32, f32],
  layouts = [#iree_cpu.cpu_encoding_layout<configuration = {
    encoding_info = {innerDimsPos = [0, 1],
                     innerTileSizes = [16, 1],
                     outerDimsPerm = [0, 1]}}>]
>
  operand_index = 0 : index, op_type = matmul, element_types = [f32, f32, f32],
  user_indexing_maps = [#map, #map1, #map2]
>
%7 = flow.dispatch.tensor.load %4,
  offsets = [0, 0], sizes = [%2, %0], strides = [1, 1]
  : !flow.dispatch.tensor<readonly:tensor<?x?xf32, #encoding3>>{%2, %0}
  -> tensor<?x?xf32, #encoding6>
```

@hanhanW force-pushed the users/hanhanW/specialize-encodings-2-n branch from c2756a0 to d80e377 on January 15, 2025 08:28
@hanhanW force-pushed the specialize-encodings-5-n branch from bc4cbc0 to 098e173 on January 15, 2025 08:28
@hanhanW marked this pull request as draft on January 24, 2025 13:45
@hanhanW changed the base branch from users/hanhanW/specialize-encodings-2-n to users/hanhanW/specialize-encodings-dup-2 on January 24, 2025 15:03
@hanhanW force-pushed the specialize-encodings-5-n branch from c5e737b to fa69c58 on January 24, 2025 15:04
@hanhanW force-pushed the specialize-encodings-5-n branch 3 times, most recently from 9c827e1 to ec2be7c, on February 3, 2025 06:50
@hanhanW changed the base branch from users/hanhanW/specialize-encodings-dup-2 to main on February 3, 2025 06:51
@hanhanW force-pushed the specialize-encodings-5-n branch from ec2be7c to 2c25a19 on February 3, 2025 07:01
@hanhanW marked this pull request as ready for review on February 3, 2025 07:03
hanhanW (Contributor, Author) commented Feb 3, 2025

@benvanik this is ready for review. In the meantime, I'm working on the testing encoding attribute, so we do not need to rely on VMVX/CPU ones.

@hanhanW requested a review from kuhar on February 3, 2025 09:12
hanhanW (Contributor, Author) commented Feb 3, 2025

> @benvanik this is ready for review. In the meantime, I'm working on the testing encoding attribute, so we do not need to rely on VMVX/CPU ones.

#19879 is up for review. I'll rebase and update the tests once it's landed.

benvanik (Collaborator) left a comment:

looks great, and your new test encoding attrs will clean up the tests nicely!

hanhanW (Contributor, Author) commented Feb 3, 2025

> looks great, and your new test encoding attrs will clean up the tests nicely!

Thank you, I'm going to merge the cleanup and update the tests. Thanks for the quick review!

hanhanW (Contributor, Author) commented Feb 3, 2025

> looks great, and your new test encoding attrs will clean up the tests nicely!

I really like the test encoding attrs. It looks beautiful to me, see 387057d :)

@hanhanW merged commit 78f312b into iree-org:main on Feb 3, 2025; 42 checks passed
@hanhanW deleted the specialize-encodings-5-n branch on February 3, 2025 17:56
ita9naiwa pushed a commit to ita9naiwa/iree that referenced this pull request Feb 4, 2025