[Stream] Update executable functions in encoding specialization pass. #19700
It depends on #19527, and it is ready for review.
cc @bjacob @Max191 Heads-up. In the future, I expect to see something like below. In my encoding specialization pass, we resolve layouts and attach them to bindings. I think this is where the layout transfer would happen. If the requested layout is different (i.e., the load shape is different from the source shape), we'll need to generate relayout ops in codegen.
The duplicated executables need to update the encoding types with resolved layouts for those stream bindings. In this revision, that is done through a type interface.

The revision introduces EncodingTypeInterface in the IREE::Encoding dialect. There are two interface methods:

- getEncodingType: returns the tensor type with the encoding. E.g., the bound type is returned in the Flow::DispatchTensorType implementation.
- updateEncoding: returns the same type but with the new encoding. E.g., the encoding in the bound type is updated to the new encoding in the Flow::DispatchTensorType implementation.

The revision implements the interface methods for Flow::DispatchTensorType and uses them to update the bindings.

Codegen already looks at flow.dispatch.tensor.load/store and will have the incoming layout information available. If a layout transfer is needed, codegen should be able to generate relayout ops when it materializes the encodings. E.g., the incoming layout is attached and codegen knows the target layout, so it can generate relayout ops to bring the incoming layout to the target layout.

```
operand_index = 0 : index, op_type = matmul, element_types = [f32, f32, f32], layouts = [#iree_cpu.cpu_encoding_layout<configuration = { encoding_info = {innerDimsPos = [0, 1], innerTileSizes = [16, 1], outerDimsPerm = [0, 1]}}>] >
operand_index = 0 : index, op_type = matmul, element_types = [f32, f32, f32], user_indexing_maps = [#map, #map1, #map2] >
%7 = flow.dispatch.tensor.load %4, offsets = [0, 0], sizes = [%2, %0], strides = [1, 1] : !flow.dispatch.tensor<readonly:tensor<?x?xf32, #encoding3>>{%2, %0} -> tensor<?x?xf32, #encoding6>
```

Signed-off-by: hanhanW <[email protected]>
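To make the two interface methods concrete, here is a minimal toy model in plain C++. It is only a sketch of the idea: `ToyTensorType`, `ToyDispatchTensorType`, and `updateBindings` are hypothetical stand-ins invented for illustration, encodings are modeled as plain strings, and nothing here uses the real MLIR or IREE APIs.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a ranked tensor type that carries an encoding.
struct ToyTensorType {
  std::string shape;     // e.g. "?x?xf32"
  std::string encoding;  // e.g. "#encoding3"
};

// Hypothetical stand-in for a dispatch tensor type wrapping a bound type.
struct ToyDispatchTensorType {
  std::string access;  // e.g. "readonly"
  ToyTensorType bound;

  // Mirrors the described getEncodingType: return the tensor type that
  // carries the encoding (here, the bound type).
  ToyTensorType getEncodingType() const { return bound; }

  // Mirrors the described updateEncoding: return the same type, but with
  // the encoding of the bound type replaced. The original is not mutated.
  ToyDispatchTensorType updateEncoding(const std::string &newEncoding) const {
    ToyDispatchTensorType updated = *this;
    updated.bound.encoding = newEncoding;
    return updated;
  }
};

// Rewrites every binding whose encoding has a resolved counterpart in
// `resolved`; bindings without an entry are left unchanged.
std::vector<ToyDispatchTensorType> updateBindings(
    const std::vector<ToyDispatchTensorType> &bindings,
    const std::map<std::string, std::string> &resolved) {
  std::vector<ToyDispatchTensorType> result;
  result.reserve(bindings.size());
  for (const ToyDispatchTensorType &binding : bindings) {
    auto it = resolved.find(binding.getEncodingType().encoding);
    result.push_back(it == resolved.end()
                         ? binding
                         : binding.updateEncoding(it->second));
  }
  return result;
}
```

Returning a new value from `updateEncoding` instead of mutating in place follows the usual convention that types are immutable values, so updating a binding means swapping in a new type.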
@benvanik this is ready for review. In the meantime, I'm working on the testing encoding attribute, so we do not need to rely on VMVX/CPU ones.
looks great, and your new test encoding attrs will clean up the tests nicely!
Thank you, I'm going to merge the cleanup and update the tests. Thanks for the quick review!
Signed-off-by: hanhanW <[email protected]>
I really like the test encoding attrs. It looks beautiful to me, see 387057d :)
…iree-org#19700) Signed-off-by: Hyunsung Lee <[email protected]>
For the stream bindings, the encoding types of duplicated executables need to be updated with resolved layouts. In the revision, it is done by type interface. The revision introduces EncodingTypeInterface in the IREE::Encoding dialect. There are two interface methods:

- getEncodingType: returns the tensor type with the encoding. E.g., the bound type is returned in the Flow::DispatchTensorType implementation.
- updateEncoding: returns the same type but with the new encoding. E.g., the encoding in the bound type is updated to the new encoding in the Flow::DispatchTensorType implementation.

The revision implements the interface methods for Flow::DispatchTensorType and uses them to update the bindings.

Codegen already looks at flow.dispatch.tensor.load/store and will have the incoming layout information available. If a layout transfer is needed, codegen should be able to generate relayout ops when it materializes the encodings. E.g., the incoming layout is attached and codegen knows the target layout, so it can generate relayout ops to bring the incoming layout to the target layout.
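The relayout decision described above can be sketched in plain C++ as follows. This is an illustration under stated assumptions, not how IREE codegen is implemented: `ToyLayout`, `isWellFormed`, and `needsRelayout` are hypothetical names, and the three fields simply copy the `innerDimsPos`, `innerTileSizes`, and `outerDimsPerm` entries shown in the encoding info of the PR description.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a resolved layout attached to a binding.
struct ToyLayout {
  std::vector<int> innerDimsPos;
  std::vector<int> innerTileSizes;
  std::vector<int> outerDimsPerm;

  bool operator==(const ToyLayout &other) const {
    return innerDimsPos == other.innerDimsPos &&
           innerTileSizes == other.innerTileSizes &&
           outerDimsPerm == other.outerDimsPerm;
  }
};

// Sanity check used by this sketch only: each tiled dimension needs
// exactly one tile size.
bool isWellFormed(const ToyLayout &layout) {
  return layout.innerDimsPos.size() == layout.innerTileSizes.size();
}

// Returns true when the incoming layout (attached to the binding) differs
// from the target layout, i.e. when relayout ops would have to be generated.
bool needsRelayout(const ToyLayout &incoming, const ToyLayout &target) {
  assert(isWellFormed(incoming) && isWellFormed(target));
  return !(incoming == target);
}
```

For example, an incoming layout with `innerTileSizes = [16, 1]` loaded by a consumer that expects `[8, 1]` would report that a relayout is needed, while identical layouts would not.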