-
Notifications
You must be signed in to change notification settings - Fork 796
[SYCLLowerIR] Fix a bug in hierarchical parallelism implementation #20484
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SYCLLowerIR] Fix a bug in hierarchical parallelism implementation #20484
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
Why E2E test instead of exercising the transformation directly through |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SYCL changes LGTM. Edit: I've missed Andrei's comment above - please consider it before merging.
|
@intel/llvm-gatekeepers please consider merging |
Thanks, added a test case there as well. |
|
@intel/llvm-gatekeepers This should be good to merge! |
|
I'd prefer if someone reviewed the new code added since last review (@aelovikov-intel ?) |
It's outside SYCL RT scope, offload utilities should review it (IIUC). |
|
Touching this file has not added any new reviewers, nevertheless I am tagging @againull for his input on the changes made in |
|
|
||
| define internal spir_func void @foo(ptr addrspace(4) %arg, ptr byval(%struct.foo.0) align 1 %arg1) align 2 !work_group_scope !0 { | ||
| bb: | ||
| call spir_func void @baz(ptr addrspace(4) %arg, ptr byval(%struct.foo.0) align 1 %arg1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder what happens if there are additional instructions in foo (in work group scope) except parallel_for_work_item call. Is this handled properly or is it a separate bug?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added some other instructions to the foo function. No failures introduced.
When inside a
parallel_for_work_groupcontext, calls to functions that contain calls toparallel_for_work_itemare being lowered incorrectly into IR in that they are being put under the work-group leader branch in the IR which is semantically incorrect as this function should be called in every work item. This manifests when we have an indirect function call toparallel_for_work_itemtogether with at least one other direct call toparallel_for_work_itemin the sameparallel_for_work_groupcontext and it leads to a program that hangs. This PR fixes the issue and adds a couple of other tests to check this behavior.