
@shadidashmiz shadidashmiz commented Jan 8, 2026

  • Add the maybe_undef attribute to the shfl_sync functions to avoid a compiler failure on uninitialized values

Motivation

Make the shfl_sync functions compatible with CUDA.

Technical Details

Add the attribute so that the shfl_sync functions work with uninitialized variables.

JIRA ID

SWDEV-573004

Test Plan

HIP tests

Test Result

PSDB test results

Submission Checklist

Copilot AI review requested due to automatic review settings January 8, 2026 16:08
@shadidashmiz shadidashmiz requested review from a team as code owners January 8, 2026 16:08

Copilot AI left a comment


Pull request overview

This PR adds the MAYBE_UNDEF attribute to the var parameter in all four shfl_sync function variants (__shfl_sync, __shfl_up_sync, __shfl_down_sync, and __shfl_xor_sync) so that passing an uninitialized value no longer causes a compiler failure. The attribute tells the compiler that it is acceptable for these parameters to be uninitialized, which is appropriate for warp shuffle operations, where a lane may use a value supplied by another lane instead of its own.

Key changes:

  • Added MAYBE_UNDEF attribute to the var parameter in all four template shfl_sync functions
  • Changes align with the pattern already used in non-sync warp functions in amd_warp_functions.h
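
The pattern described above can be sketched on the host as follows. This is a minimal model, not the HIP implementation: `SKETCH_MAYBE_UNDEF`, `kLanes`, `shfl_sync_sketch`, and the `self` and `published` parameters are hypothetical names introduced for illustration, and the sketch assumes that MAYBE_UNDEF expands to Clang's `maybe_undef` parameter attribute. On compilers that do not provide the attribute, the macro expands to nothing.

```cpp
#include <array>
#include <cassert>

// Hypothetical stand-in for the MAYBE_UNDEF macro named in this PR.
// Assumption: the real macro expands to Clang's maybe_undef parameter attribute.
#if defined(__has_attribute)
#  if __has_attribute(maybe_undef)
#    define SKETCH_MAYBE_UNDEF __attribute__((maybe_undef))
#  endif
#endif
#ifndef SKETCH_MAYBE_UNDEF
#  define SKETCH_MAYBE_UNDEF  // compilers without the attribute: expands to nothing
#endif

constexpr int kLanes = 8;  // hypothetical tiny "warp" for this host-side model

// Host-side model of a shuffle read for one lane (not the HIP implementation).
// `var` is this lane's own contribution; `published` holds what every lane offered.
template <typename MaskT, typename T>
T shfl_sync_sketch(MaskT mask, SKETCH_MAYBE_UNDEF T var, int srcLane, int self,
                   const std::array<T, kLanes>& published) {
  (void)mask;  // the participation mask is not modeled here
  assert(srcLane >= 0 && srcLane < kLanes);
  // A lane reading from itself gets its own `var`; otherwise it gets the
  // source lane's value and its own `var` is never consumed.
  return srcLane == self ? var : published[srcLane];
}
```

The annotation sits on the by-value `var` parameter because that is the argument a lane may pass without having initialized it. In this model, a lane whose `srcLane` differs from `self` never reads its own `var`.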


@shadidashmiz shadidashmiz force-pushed the amd/sdashmiz/SWDEV-573004 branch 2 times, most recently from 4ba1d1d to 293baf3 Compare January 13, 2026 19:34
- add attribute for maybe undef

Signed-off-by: sdashmiz <[email protected]>
@shadidashmiz shadidashmiz force-pushed the amd/sdashmiz/SWDEV-573004 branch from 524f658 to 212b769 Compare January 16, 2026 18:11