-
Notifications
You must be signed in to change notification settings - Fork 940
Fixes to acoll component #13575
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixes to acoll component #13575
Conversation
Adds some bug fixes to the acoll component. - gather is corrected to work for non-zero root ranks when three stage algorithm is used. - Busy waits in shared memory bcast and barrier are changed to avoid random hangs. - Overrides with command line arguments are properly taken care of in bcast algorithm selection. - Multinode path of reduce and allreduce is fixed to use the hierarchical algorithms of acoll. - Compile time warnings are removed. Signed-off-by: Nithya V S <[email protected]>
edgargabriel
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a few minor formatting things, but otherwise it looks good in my opinion.
| } | ||
|
|
||
| if (MPI_SUCCESS != err) { | ||
| if (NULL != inplacebuf_free) { free(inplacebuf_free); } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please move the free(inplace_buf) into a separate line (two more instances of the same issue below).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
| } | ||
| } | ||
|
|
||
| if (num_nodes == 1) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the PR has a number of places where the constant is not listed at the first part of the comparison. Its most relevant for the == options, but does also apply for != and other operands (at many places you have the correct order, e.g. a few lines above at the if (MPI_SUCCESS != err)). Please double check the PR for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done here and in other places in multiple files.
Changes in formatting to adhere to ompi coding style. Signed-off-by: Nithya V S <[email protected]>
Fixes a condition for freeing temporary buffer in acoll reduce. Signed-off-by: Nithya V S <[email protected]>
amd-nithyavs
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made the changes along with one more fix related to a buffer free.
edgargabriel
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
…ixes Fixes to acoll component (cherry picked from commit e47e9dd)
Adds some bug fixes to the acoll component.