@lbushi25 lbushi25 commented Oct 27, 2025

Inside a parallel_for_work_group context, calls to functions that themselves call parallel_for_work_item are lowered to IR incorrectly: they are placed under the work-group leader branch, which is semantically wrong because such a function must be called by every work item. The bug manifests when the same parallel_for_work_group context contains an indirect call to parallel_for_work_item (through another function) together with at least one direct call to parallel_for_work_item, and it results in a program that hangs. This PR fixes the issue and adds a couple of tests to check this behavior.
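To make the description above concrete, here is a toy model in plain C++. It is only a sketch and not the SYCL runtime or the actual lowering pass: work items are simulated as sequential loop iterations, and every name in it (`kGroupSize`, `kLeaderId`, `pfwi_body`, `helper_with_pfwi`, `run_buggy`, `run_fixed`) is hypothetical. It only counts which simulated work items reach the parallel_for_work_item body when the wrapping helper is, or is not, guarded by the leader branch. It does not reproduce the hang itself.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only: hypothetical names, no SYCL, no LLVM.
// Each "work item" of one work group is a sequential loop iteration.
constexpr std::size_t kGroupSize = 4;  // assumed work-group size
constexpr std::size_t kLeaderId = 0;   // assumed id of the work-group leader

// Stand-in for a parallel_for_work_item body: records that work item `id` ran it.
void pfwi_body(std::size_t id, std::vector<int>& hits) {
  assert(id < hits.size());
  hits[id] += 1;
}

// Stand-in for a function called from work-group scope that itself
// contains a parallel_for_work_item call (the "indirect" case).
void helper_with_pfwi(std::size_t id, std::vector<int>& hits) {
  pfwi_body(id, hits);
}

// Models the incorrect lowering: the helper call sits under the leader branch.
std::vector<int> run_buggy() {
  std::vector<int> hits(kGroupSize, 0);
  for (std::size_t id = 0; id < kGroupSize; ++id) {
    if (id == kLeaderId) {
      helper_with_pfwi(id, hits);  // only the leader reaches the indirect call
    }
    pfwi_body(id, hits);  // the direct call is reached by every work item
  }
  return hits;
}

// Models the intended behavior: every work item calls the helper.
std::vector<int> run_fixed() {
  std::vector<int> hits(kGroupSize, 0);
  for (std::size_t id = 0; id < kGroupSize; ++id) {
    helper_with_pfwi(id, hits);
    pfwi_body(id, hits);
  }
  return hits;
}
```

In this model `run_buggy()` leaves the non-leader work items with one fewer visit to the work-item body than the leader, while `run_fixed()` gives every work item the same count. That mismatch between the leader and the rest of the group is the kind of divergence the fix is meant to remove.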

Contributor

@YuriPlyakhin YuriPlyakhin left a comment

LGTM

@aelovikov-intel
Contributor

Why E2E test instead of exercising the transformation directly through opt?

Contributor

@slawekptak slawekptak left a comment

SYCL changes LGTM. Edit: I missed Andrei's comment above; please consider it before merging.

@github-actions
Contributor

@intel/llvm-gatekeepers please consider merging

@lbushi25
Contributor Author

> Why E2E test instead of exercising the transformation directly through opt?

Thanks, added a test case there as well.

@lbushi25
Contributor Author

@intel/llvm-gatekeepers This should be good to merge!

@sarnex
Contributor

sarnex commented Oct 28, 2025

I'd prefer if someone reviewed the new code added since last review (@aelovikov-intel ?)

@aelovikov-intel
Contributor

> I'd prefer if someone reviewed the new code added since last review (@aelovikov-intel ?)

It's outside SYCL RT scope; offload utilities should review it (IIUC).

@lbushi25
Contributor Author

Touching this file has not added any new reviewers. Nevertheless, I am tagging @againull for his input on the changes made in pfwg_and_pfwi.ll, as he seems to have created this file.


define internal spir_func void @foo(ptr addrspace(4) %arg, ptr byval(%struct.foo.0) align 1 %arg1) align 2 !work_group_scope !0 {
bb:
call spir_func void @baz(ptr addrspace(4) %arg, ptr byval(%struct.foo.0) align 1 %arg1)
Contributor

I wonder what happens if foo (in work-group scope) contains additional instructions besides the parallel_for_work_item call. Is this handled properly, or is it a separate bug?

Contributor Author

I have added some other instructions to the foo function. No failures introduced.

@againull againull merged commit 4f7b179 into intel:sycl Oct 28, 2025
28 checks passed