Intra-node shared memory (SHM) optimizations for CPU primitives #458
Open
gaopengff wants to merge 17 commits into pytorch:main from gaopengff:gaopengf/shm
+624
−2
Changes from 13 commits
1738e8e add shm allreduce (gaopengff)
76d1114 add bf16 and half support (gaopengff)
2d152a3 remove bf16 support (gaopengff)
8c29eeb add bf16 support (gaopengff)
554d317 use reduce function to do reduce job (gaopengff)
0fdde35 refine format (gaopengff)
be7da7c fix accuracy issue (gaopengff)
5b698dc move intro-node check to allreduce() (gaopengff)
3564e95 remove debug code (gaopengff)
3ac8065 use local_rank to check intra-node condition (gaopengff)
8d3ae22 add support for multi-thread and code for local reduction (gaopengff)
2985c3e move intra-node check to createAndConnectAllPairs (gaopengff)
704d35d update to main (gaopengff)
9ae2842 add check for gpu input and fix format issue (gaopengff)
1b7660a fix format issue (gaopengff)
ab9c63b fix timeout ut (gaopengff)
9f726e9 add fininalize method and fix ut (gaopengff)
Can we add a check to make sure the inputs are on CPU before switching to SHM? I believe this would fail when using the local ibv CUDA backend.
We could also check device->hasGPUDirect() and instead disable SHM for all backends that support GPU Direct.
Sure. I've changed the check condition to "(context->isIntraNode() && !context->getDevice()->hasGPUDirect())".
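For readers following the thread, here is a minimal sketch of the gating condition quoted above. Only the names isIntraNode(), getDevice() and hasGPUDirect() come from the discussion. The Device and Context structs, their fields, and the helper shouldUseShm() are hypothetical stand-ins for illustration, not the code in this PR. The sketch also assumes isIntraNode() reports that the participating ranks share a node.

```cpp
#include <memory>

// Hypothetical stand-in for the transport device; only the method name
// hasGPUDirect() is taken from the review thread.
struct Device {
  bool gpuDirect = false;
  bool hasGPUDirect() const { return gpuDirect; }
};

// Hypothetical stand-in for the context; isIntraNode() and getDevice()
// are the names used in the thread, the fields are assumptions.
struct Context {
  bool intraNode = false;
  std::shared_ptr<Device> device = std::make_shared<Device>();
  bool isIntraNode() const { return intraNode; }
  const std::shared_ptr<Device>& getDevice() const { return device; }
};

// Hypothetical helper: take the SHM path only for intra-node contexts whose
// device does not report GPU Direct support, since such a device may be
// handed GPU buffers that a CPU shared-memory path cannot read.
inline bool shouldUseShm(const std::shared_ptr<Context>& context) {
  return context->isIntraNode() && !context->getDevice()->hasGPUDirect();
}
```

With this shape, a backend that reports GPU Direct support falls back to the regular transport path even when all ranks are local, which is the behavior the reviewer asked for.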