[ROCm] Stop hipifying native/cudnn/ and native/quantized/cudnn/ files #3196
Open
rkayaith wants to merge 1 commit into hipdnn_develop from
Conversation
These files are guarded by `#if AT_CUDNN_ENABLED()`, which is always 0 on ROCm, so only stub implementations compile. Hipify was making text substitutions (cudnnHandle_t → miopenHandle_t, etc.) to code that is entirely dead on ROCm.

Fixes needed to compile without hipify:

- RNN.cpp: move CUDAEvent.h, CUDAGraphsUtils.cuh, and Exceptions.h into the `#else // AT_CUDNN_ENABLED()` block (they are only used by the real implementation)
- LossCTC.cpp: remove the unused CUDAGraphsUtils.cuh include
- BatchNorm.cpp, Module.cpp, attention.cu, attention_backward.cu: remove the `#ifdef __HIP_PLATFORM_AMD__` guards that selected hipified header paths (cudnn/hip/MHA.h, cudnn/hip/BatchNorm.h) — use the originals directly, since hipify no longer runs on these files

The quantized/cudnn/ files additionally had redundant `#ifdef USE_CUDA` guards wrapping the entire file. These files are only compiled in CUDA/ROCm builds (gated by CMake), so the guards were dead code.

Authored with Claude.
This was referenced May 1, 2026
Jenkins build for fe50fbc0508906dcdcf0bd2b487580278c268a2d commit finished as FAILURE
zjgarvey reviewed May 4, 2026 and left a comment:
Some questions. Definitely want an eye from @jeffdaily on some of these changes, too.
zjgarvey approved these changes May 6, 2026 and left a comment:
Looks good to me, but might be good to have @jeffdaily take a look to see if there is anything that might be problematic about the hipify changes.
This PR adds hipDNN as a supported backend for SDPA.
The approach taken here is to re-use the existing `CUDNN_ATTENTION` backend, adding support for it by routing through hipDNN when compiled for ROCm; i.e. `torch.nn.attention.SDPBackend.CUDNN_ATTENTION` and `aten::_scaled_dot_product_cudnn_attention` now work on ROCm.

The primary change here is adding `cudnn/hip/MHA.cpp`, which is a "fork" of `cudnn/MHA.cpp` modified to use hipDNN instead of cuDNN. There are various differences between the implementations that made it simpler to fork the entire file rather than trying to keep both implementations in the same file or rely on hipify for translation:

Additionally, since hipDNN provides an API for querying engine support for a graph, during backend selection `sdp::can_use_cudnn_attention` calls the newly added `at::native::check_cudnn_sdpa_support`, which constructs the hipDNN graph for both forwards and backwards (if potentially needed) to query support. This is not cached, as it's assumed that construction + querying support is implemented efficiently. The sequence of hipDNN calls:

Both backends still share the same `attention.cu` kernel (hipDNN uses the hipified version).

This is separated into a stack of PRs for easier review:
PR 1/3: [ROCm] Stop hipifying native/cudnn/ and native/quantized/cudnn/ files (this PR)

ifdef'd out. This disables the hipification rules and fixes up includes/directives so the CUDA files compile cleanly on ROCm. The primary motivation here is to stop generating `cudnn/hip/MHA.cpp`; the following changes add this file back with the hipDNN backend implementation.

PR 2/3: [ROCm] Integrate hipDNN as an SDPA backend:

- Add aotriton.images/ to .gitignore (NFC change)
- Extract compute_matching_strides from alloc_with_matching_layout (NFC change)
- Split cudnn/MHA.cpp into separate cuDNN and hipDNN files
- Copy cuDNN MHA implementation verbatim into hipDNN MHA
- [ROCm] Add hipDNN SDPA backend dispatch
- Update tests to reflect the cudnn backend being available on ROCm; `CUDNN_ATTENTION` tests can now be run on ROCm

PR 3/3: [ROCm] Pass bool attention masks directly to hipDNN
Open questions for reviewers:

- Re-using the `CUDNN_ATTENTION` backend was done due to API similarities, though this could potentially be confusing. Would it be preferable to completely separate hipDNN by adding a new backend?
- […] `MHA.cpp` files be recommended?