Add a custom `gpu-module-to-binary` pass. #525

Hardcode84 · 2025-12-03T23:24:06Z

This is mostly copy-pasted from the upstream gpu-module-to-binary, with the non-essential parts stripped out (like support for other vendors, hehe).

The one improvement over the upstream pass is the ability to dump and override LLVM IR and assembly, for debugging purposes.

Copilot

Pull request overview

This PR adds a custom gpu-module-to-binary pass for Water/Wave that compiles GPU modules to HSACO binaries. The implementation is based on MLIR's upstream gpu-module-to-binary pass but streamlined to support only ROCDL/AMD targets. A key enhancement is the ability to dump and override intermediate compilation artifacts (LLVM IR, assembly) for debugging purposes.

Key changes:

- Implements the WaterGPUModuleToBinary pass, which translates GPU modules to LLVM IR, optimizes them, compiles them to ISA, and assembles them into HSACO binaries
- Adds the assembleISAToHSACO utility function, which uses the LLVM MC infrastructure and ld.lld to produce HSACO files
- Provides dump-intermediates and override-intermediates options for debugging compilation stages
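
The dump/override idea can be sketched as a small per-stage wrapper. This is a minimal illustration using only the C++ standard library, not the code from this PR: `runStage`, `readFile`, the file naming, and the choice to dump an overridden artifact as well are all assumptions made for the example.

```cpp
#include <filesystem>
#include <fstream>
#include <functional>
#include <iterator>
#include <optional>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper: read a whole file, or nullopt if it cannot be opened.
static std::optional<std::string> readFile(const fs::path &path) {
  std::ifstream in(path, std::ios::binary);
  if (!in)
    return std::nullopt;
  return std::string(std::istreambuf_iterator<char>(in),
                     std::istreambuf_iterator<char>());
}

// Hypothetical stage runner. If `overrideDir` contains `fileName`, its
// contents replace the computed artifact; otherwise `compute` runs. If
// `dumpDir` is set, the resulting artifact is written there.
static std::optional<std::string>
runStage(const std::string &fileName,
         const std::function<std::optional<std::string>()> &compute,
         const std::optional<fs::path> &dumpDir,
         const std::optional<fs::path> &overrideDir) {
  std::optional<std::string> result;
  if (overrideDir && fs::exists(*overrideDir / fileName))
    result = readFile(*overrideDir / fileName);
  else
    result = compute();
  if (!result)
    return std::nullopt; // Propagate the failure to the caller.

  if (dumpDir) {
    std::error_code ec;
    fs::create_directories(*dumpDir, ec);
    if (ec)
      return std::nullopt;
    std::ofstream out(*dumpDir / fileName, std::ios::binary);
    out << *result;
    out.flush();
    if (!out)
      return std::nullopt;
  }
  return result;
}
```

With a wrapper like this, a debugging session would dump once, edit the dumped assembly by hand, and then rerun with the same directory passed as the override directory.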

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| water/lib/Transforms/GPUModuleToBinary.cpp | Main pass implementation handling GPU module serialization through multiple stages (LLVM IR translation, linking, optimization, ISA compilation, and HSACO assembly) |
| water/lib/Transforms/AssembleISA.cpp | Assembly utilities for converting ISA to HSACO using LLVM MC infrastructure and ld.lld linker |
| water/lib/Transforms/AssembleISA.h | Header defining assembly interface and AMDGPU target initialization |
| water/include/water/Transforms/Passes.td | Pass definition with options for toolkit path, bitcode linking, and intermediate file dumping/overriding |
| water/lib/Transforms/CMakeLists.txt | Build configuration adding new source files and MLIRROCDLTarget library dependency |
| water/tools/water-opt/water-opt.cpp | Registration of LLVM conversion interfaces, GPU passes, and translation modules needed for the new pass |
| water/tools/water-opt/CMakeLists.txt | Added LLVM IR translation library dependencies |
| water/test/Transforms/gpu-module-to-binary.mlir | Basic test verifying GPU module to binary conversion |
| water/test/Transforms/gpu-module-to-binary-dump.mlir | Test validating intermediate file dumping functionality |
| water/test/Transforms/gpu-module-to-binary-override.mlir | Test demonstrating intermediate file override capability for debugging |

Comments suppressed due to low confidence (3)

water/lib/Transforms/CMakeLists.txt:29

Missing library dependency: GPUModuleToBinary.cpp calls translateModuleToLLVMIR which requires the MLIRTargetLLVMIRExport library. This should be added to the LINK_LIBS PUBLIC section to ensure proper linking.

  LINK_LIBS PUBLIC
  MLIRAnalysis
  MLIRArithDialect
  MLIRControlFlowDialect
  MLIRFuncDialect
  MLIRGPUDialect
  MLIRIR
  MLIRLLVMDialect
  MLIRMemRefDialect
  MLIRPass
  MLIRROCDLTarget
  MLIRRewrite
  MLIRTransformUtils
  MLIRVectorDialect
  MLIRWaterAnalysis

water/lib/Transforms/CMakeLists.txt:29

Missing library dependency: GPUModuleToBinary.cpp uses makeOptimizingTransformer from mlir/ExecutionEngine/OptUtils.h, which requires the MLIRExecutionEngine library. This should be added to the LINK_LIBS PUBLIC section.

water/lib/Transforms/CMakeLists.txt:29

Missing library dependency: GPUModuleToBinary.cpp uses ROCDL::ROCDLTargetAttr and ROCDL::getROCMPath() which require the MLIRROCDLDialect library. This should be added to the LINK_LIBS PUBLIC section.

ftynse · 2025-12-04T07:35:55Z

Is the ability to inject a different assembly something we should rather put upstream? @fabianmcg may be interested and he created the current design for this pass.

tgymnich · 2025-12-04T09:25:08Z

water/lib/Transforms/AssembleISA.cpp

+  (void)initialized;
+}
+
+FailureOr<SmallVector<char, 0>>


The preferred data type for raw memory is unsigned char or std::byte.
edit: I now see it used everywhere like this in the MLIR codebase. I don't understand why.

Using unsigned char or std::byte instead of char distinguishes raw bytes from null-terminated strings, mostly to avoid accidentally passing raw bytes to C string APIs.

In this case it might be worth disambiguating, because HSACO has both a text and a serialized representation, which I would expect to be const char* and unsigned char* respectively.

Yeah, all LLVM classes like MemoryBuffer use char*.
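
The trade-off in this thread can be illustrated without LLVM. The sketch below uses `std::string_view` and `std::vector` as stand-ins for `StringRef` and `SmallVector`; the helper names `toBytes` and `asChars` are hypothetical. It shows that byte-typed storage is easy to produce, but every boundary with a char-based API needs a cast.

```cpp
#include <string_view>
#include <vector>

// Copy char-typed storage (as returned by char-based buffer APIs) into a
// byte-typed container, so it cannot be mistaken for a C string.
std::vector<unsigned char> toBytes(std::string_view buffer) {
  const auto *begin = reinterpret_cast<const unsigned char *>(buffer.data());
  return std::vector<unsigned char>(begin, begin + buffer.size());
}

// Handing the bytes back to a char-based API needs the opposite cast.
std::string_view asChars(const std::vector<unsigned char> &bytes) {
  return std::string_view(reinterpret_cast<const char *>(bytes.data()),
                          bytes.size());
}
```

Both casts are well defined, since `char` and `unsigned char` may alias any object, but they have to be repeated wherever the two conventions meet.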

tgymnich · 2025-12-04T09:28:50Z

water/lib/Transforms/AssembleISA.cpp

+    return op->emitError("Failed to read HSACO from temporary file");
+
+  StringRef buffer = (*hsacoFile)->getBuffer();
+  return SmallVector<char, 0>(buffer.begin(), buffer.end());


Suggested change

-  return SmallVector<char, 0>(buffer.begin(), buffer.end());
+  return SmallVector<unsigned char, 0>(buffer.bytes_begin(), buffer.bytes_end());

See the previous comment. I can do it, but it will take a lot of pointer casts.

Hardcode84 · 2025-12-04T09:56:36Z

> Is the ability to inject a different assembly something we should rather put upstream?

Maybe in a month (or three), after we have bikeshedded the callback return types enough: llvm/llvm-project#170134

fabianmcg · 2025-12-04T10:44:30Z

FWIW, it's been on my TODO list for a while to break gpu-module-to-binary into gpu-compile-module-to and gpu-compile-binary-to (allowing compilation to stop and resume at different representations). That would essentially allow what you want. However, it's not on my critical path, and I don't think it will be there before the end of the year.

From @Hardcode84's original upstream PR, there seemed to be no need to modify LLVM IR or assembly; my issue with the PR is that an error is signaled inside a callback without being propagated.
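
To make the propagation concern concrete, here is a minimal sketch of a callback whose failure reaches the caller. It is not the upstream API, and it does not reproduce the signature discussed in llvm/llvm-project#170134; `StageCallback` and `compileWithCallback` are hypothetical names, and a plain `bool` stands in for a proper result type.

```cpp
#include <functional>
#include <string>

// Hypothetical callback type: receives the ISA text and returns false to
// signal failure.
using StageCallback = std::function<bool(const std::string &)>;

// Hypothetical driver: the callback result is checked and forwarded, so a
// failure inside the callback stops compilation instead of being lost.
bool compileWithCallback(const std::string &isa, const StageCallback &onISA,
                         std::string &error) {
  if (onISA && !onISA(isa)) {
    error = "ISA callback failed";
    return false;
  }
  // Later stages (assembling, linking) would run here.
  return true;
}
```

A callback returning `void` would leave the driver unable to tell whether, for example, writing a dump file failed.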

Hardcode84 · 2025-12-04T10:51:18Z

> From @Hardcode84's original upstream PR, there seemed to be no need to modify LLVM IR or assembly; my issue with the PR is that an error is signaled inside a callback without being propagated.

I went upstream with just the dump first, as the least controversial part, and even that got stuck (see the PR on callbacks).

Signed-off-by: Ivan Butygin <[email protected]>

ftynse · 2025-12-05T08:52:16Z

> I went upstream with just the dump first, as the least controversial part, and even that got stuck (see the PR on callbacks).

You shouldn't need errors if you just dump though :) It becomes controversial because people pick up on what you actually intend to do.

Hardcode84 · 2025-12-05T12:28:23Z

Anyway, let's merge this one for now?

Hardcode84 · 2025-12-08T09:48:41Z

Ping. This one is actually blocking a few other things, as there is no way to dump asm in the new pipeline.

ftynse

Let's have it, but remove it by the end of January, after making upstream work properly. Having a copy of the code is a long-term maintenance liability.

fabianmcg · 2025-12-08T17:36:22Z

FWIW, if you only care about dumping the ASM to check it, you can use debug-only=serialize-to-isa: https://github.com/llvm/llvm-project/blob/main/mlir/lib/Target/LLVM/ROCDL/Target.cpp#L431-L436

Hardcode84 · 2025-12-08T18:25:12Z

debug-only only works if you build in debug/with assertions (which we do currently, but still)
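
The reason is that LLVM's debug-output macros are compiled out when `NDEBUG` is defined. The sketch below imitates that mechanism in a simplified form. `SKETCH_DEBUG`, `kDebugOutputCompiledIn`, and `serializeWithDebugDump` are hypothetical names, and the real macro additionally checks the debug type selected at runtime.

```cpp
#include <sstream>
#include <string>

// Simplified stand-in for a debug-only output macro: the statement exists
// only in builds where NDEBUG is not defined.
#ifndef NDEBUG
#define SKETCH_DEBUG(os, msg)                                                  \
  do {                                                                         \
    (os) << (msg);                                                             \
  } while (false)
constexpr bool kDebugOutputCompiledIn = true;
#else
#define SKETCH_DEBUG(os, msg)                                                  \
  do {                                                                         \
  } while (false)
constexpr bool kDebugOutputCompiledIn = false;
#endif

// Returns whatever the debug macro logged: the ISA text in assertion-enabled
// builds, and an empty string otherwise.
std::string serializeWithDebugDump(const std::string &isa) {
  (void)isa; // Unused when the macro expands to nothing.
  std::ostringstream log;
  SKETCH_DEBUG(log, isa);
  return log.str();
}
```

A dedicated dump option, by contrast, is ordinary code and behaves the same in every build configuration.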

Hardcode84 requested review from Copilot, ftynse and tgymnich December 3, 2025 23:50

Copilot started reviewing on behalf of Hardcode84 December 3, 2025 23:51

Copilot finished reviewing on behalf of Hardcode84 December 3, 2025 23:54

Copilot AI reviewed Dec 3, 2025

tgymnich reviewed Dec 4, 2025

Hardcode84 added 18 commits December 4, 2025 19:06

pass stub

aa42d43

Signed-off-by: Ivan Butygin <[email protected]>

WIP

0d9db3e

Signed-off-by: Ivan Butygin <[email protected]>

wip

f8c72ba

Signed-off-by: Ivan Butygin <[email protected]>

style

2c68a85

Signed-off-by: Ivan Butygin <[email protected]>

optimization

794d01a

Signed-off-by: Ivan Butygin <[email protected]>

opt level

84bdcbc

Signed-off-by: Ivan Butygin <[email protected]>

rename

f8b4250

Signed-off-by: Ivan Butygin <[email protected]>

ISA

3589b95

Signed-off-by: Ivan Butygin <[email protected]>

FailureOr

cf6f17a

Signed-off-by: Ivan Butygin <[email protected]>

HSACO

5e59c43

Signed-off-by: Ivan Butygin <[email protected]>

HSACO 2

13af7e9

Signed-off-by: Ivan Butygin <[email protected]>

init target

9a402dd

Signed-off-by: Ivan Butygin <[email protected]>

clenaup

287493c

Signed-off-by: Ivan Butygin <[email protected]>

dump-intermediates

ea43fba

Signed-off-by: Ivan Butygin <[email protected]>

dump hsaco

aaa4eaa

Signed-off-by: Ivan Butygin <[email protected]>

renamings

4154ee9

Signed-off-by: Ivan Butygin <[email protected]>

override

65a2256

Signed-off-by: Ivan Butygin <[email protected]>

test

d2e9e66

Signed-off-by: Ivan Butygin <[email protected]>

Hardcode84 added 8 commits December 4, 2025 19:06

create dump dir

7bc1338

Signed-off-by: Ivan Butygin <[email protected]>

cleanup

df0f21c

Signed-off-by: Ivan Butygin <[email protected]>

cleanup

421a360

Signed-off-by: Ivan Butygin <[email protected]>

override hsaco

91f8555

Signed-off-by: Ivan Butygin <[email protected]>

update test

e00427e

Signed-off-by: Ivan Butygin <[email protected]>

add lib

9b27a3f

Signed-off-by: Ivan Butygin <[email protected]>

cleanup

78160ae

Signed-off-by: Ivan Butygin <[email protected]>

fixes

7f134f6

Signed-off-by: Ivan Butygin <[email protected]>

Hardcode84 force-pushed the water-dump-intermediates-clear branch from 1228cce to 7f134f6 Compare December 4, 2025 18:12

ftynse approved these changes Dec 8, 2025

Hardcode84 merged commit 28c5473 into iree-org:main Dec 8, 2025
16 checks passed

Hardcode84 deleted the water-dump-intermediates-clear branch December 8, 2025 18:25
