Conversation

@cuichenx (Contributor) commented Jan 5, 2026

THD sink attention is supported in cuDNN 9.18.0

Description

Enable FusedAttention for the THD format with non-vanilla (sink) softmax types when cuDNN 9.18.0 or later is available. Previously, this combination unconditionally fell back from FusedAttention.

Fixes # (issue)

Type of change

  • Documentation change (change only to the documentation, either a fix or new content)
  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Infra/Build change
  • Code refactoring

Changes

Please list the changes introduced in this PR:

  • Disable FusedAttention for the THD format with non-vanilla softmax types only when the cuDNN version is below 9.18.0; keep UnfusedDotProductAttention disabled for this combination on all versions.

Checklist:

  • I have read and followed the contributing guidelines
  • The functionality is complete
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

cuichenx and others added 2 commits January 5, 2026 11:10
@greptile-apps bot commented Jan 5, 2026

Greptile Summary

This PR updates the attention backend selection logic to enable FusedAttention for THD (Token-Head-Dimension) format with non-vanilla softmax types when using cuDNN 9.18.0 or later. Previously, FusedAttention was unconditionally disabled for all THD formats with non-vanilla softmax. The change adds a version check that only disables FusedAttention for cuDNN versions below 9.18.0, allowing modern cuDNN versions to leverage the newly supported sink attention feature in THD format.

The change is minimal and focused: it wraps the FusedAttention disabling logic in a version check, while keeping UnfusedDotProductAttention disabled for all versions to maintain backward compatibility with older cuDNN versions that lack this feature.
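In spirit, the gated selection looks roughly like the sketch below. This is a paraphrase of the logic described above, not the literal diff in utils.py; names such as use_fused_attention, use_unfused_attention, qkv_format, softmax_type, cudnn_version, and logger are assumed from the surrounding backend-selection context.

    # A minimal sketch of the version-gated logic, assuming variable names from the
    # surrounding backend-selection code; not the exact code in utils.py.
    if softmax_type != "vanilla" and qkv_format == "thd":
        if cudnn_version < (9, 18, 0):
            # Sink (non-vanilla) softmax with the THD layout is only supported by
            # the fused backend starting with cuDNN 9.18.0.
            logger.debug(
                "Disabling FusedAttention: THD format with non-vanilla softmax "
                "requires cuDNN 9.18.0+"
            )
            use_fused_attention = False
        # The unfused backend still lacks this path, so it stays disabled on all versions.
        use_unfused_attention = False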

Confidence Score: 5/5

  • This PR is safe to merge with no concerns. The change is straightforward version-gated logic that enables an existing feature for newer cuDNN versions.
  • The change introduces a simple, version-gated condition (cudnn_version < (9, 18, 0)) that modifies behavior based on cuDNN capabilities. The logic is correct: it allows FusedAttention to be used for THD format with non-vanilla softmax when cuDNN 9.18.0+ is available, which is the stated feature addition. The version check pattern is consistent with other version checks in the same file (e.g., lines 491, 497). The change is backward compatible as it only enables new functionality for newer versions, while older versions continue to have FusedAttention disabled as before. No new bugs are introduced, and the scope is minimal (single conditional block).
  • No files require special attention

Important Files Changed

  • transformer_engine/pytorch/attention/dot_product_attention/utils.py: Updated the THD sink attention logic to disable FusedAttention only when the cuDNN version is below 9.18.0. This allows newer cuDNN versions to use FusedAttention with the THD format and non-vanilla softmax types, which is now supported natively.

Sequence Diagram

sequenceDiagram
    participant get_attention_backend as get_attention_backend()
    participant version_check as cudnn_version check
    participant fused_attn as FusedAttention

    get_attention_backend->>version_check: Check if softmax_type != "vanilla"
    version_check-->>get_attention_backend: True
    
    get_attention_backend->>version_check: Check if qkv_format == "thd"
    version_check-->>get_attention_backend: True
    
    rect rgb(200, 220, 255)
    Note over version_check: NEW: Version gate added
    get_attention_backend->>version_check: Check if cudnn_version < (9, 18, 0)
    end
    
    alt cuDNN < 9.18.0
        version_check-->>fused_attn: Disable FusedAttention (legacy behavior)
    else cuDNN >= 9.18.0
        version_check-->>fused_attn: Allow FusedAttention (new feature support)
    end

@greptile-apps bot commented Jan 5, 2026

Greptile found no issues!

From now on, if a review finishes and we haven't found any issues, we will not post anything, but you can confirm that we reviewed your changes in the status check section.

This feature can be toggled off in your Code Review Settings by deselecting "Create a status check for each PR".

@cyanguwa (Collaborator) commented Jan 6, 2026

Could you please add a THD test here: https://github.com/cuichenx/TransformerEngine/blob/442699c714c3e25d1797712319e32f4d569a98e5/tests/pytorch/attention/test_attention.py#L418

Thanks!
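For illustration only, a THD + sink-softmax smoke test might look roughly like the sketch below. This is not the parametrization used in test_attention.py; the softmax_type value "learnable", the "padding_causal" mask type, and the THD calling convention shown here are assumptions about the feature under test, not details taken from the existing tests.

    # Hypothetical smoke test for THD format with a non-vanilla (sink) softmax;
    # argument names and values are assumptions, not the repo's test parametrization.
    import pytest
    import torch
    import transformer_engine.pytorch as te

    @pytest.mark.skipif(not torch.cuda.is_available(), reason="CUDA required")
    def test_thd_sink_attention_smoke():
        num_heads, head_dim = 16, 64
        seqlens = torch.tensor([128, 256, 64], dtype=torch.int32)
        cu_seqlens = torch.cat(
            [torch.zeros(1, dtype=torch.int32), torch.cumsum(seqlens, dim=0).to(torch.int32)]
        ).cuda()
        total_tokens = int(cu_seqlens[-1])

        # THD layout packs variable-length sequences as [total_tokens, num_heads, head_dim].
        q, k, v = (
            torch.randn(total_tokens, num_heads, head_dim, dtype=torch.bfloat16, device="cuda")
            for _ in range(3)
        )

        attn = te.DotProductAttention(
            num_attention_heads=num_heads,
            kv_channels=head_dim,
            attn_mask_type="padding_causal",
            softmax_type="learnable",  # assumed name for a non-vanilla (sink) softmax
        ).cuda()
        out = attn(
            q, k, v,
            qkv_format="thd",
            cu_seqlens_q=cu_seqlens,
            cu_seqlens_kv=cu_seqlens,
            max_seqlen_q=int(seqlens.max()),
            max_seqlen_kv=int(seqlens.max()),
        )
        assert out.shape[0] == total_tokens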

cuichenx closed this Jan 6, 2026
cuichenx deleted the patch-1 branch January 6, 2026 23:28
@cuichenx (Contributor, Author) commented Jan 7, 2026

Accidentally closed this after renaming the branch; opened a new PR here: #2568

