@amd-nithyavs amd-nithyavs commented Jan 6, 2026

Adds some bug fixes to the acoll component.

  • gather is corrected to work for non-zero root ranks when the three-stage algorithm is used.
  • Busy waits in the shared-memory bcast and barrier are changed to avoid random hangs.
  • Command-line overrides are now honored in bcast algorithm selection.
  • The multinode path of reduce and allreduce is fixed to use acoll's hierarchical algorithms.
  • Compile-time warnings are removed.

Cherry-picked from #13575

@github-actions github-actions bot added this to the v6.0.0 milestone Jan 6, 2026
github-actions bot commented Jan 6, 2026

Hello! The Git Commit Checker CI bot found a few problems with this PR:

efd85b5: Merge pull request #13575 from amd-nithyavs/16Dec2...

  • check_signed_off: does not contain a valid Signed-off-by line

Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks!

@jsquyres jsquyres left a comment

Please do not cherry pick merge commits.

@jsquyres jsquyres changed the title from "coll/acoll: Fixes to acoll component" to "v6.0.x: coll/acoll: Fixes to acoll component" Jan 6, 2026
github-actions bot commented Jan 6, 2026

Hello! The Git Commit Checker CI bot found a few problems with this PR:

efd85b5: Merge pull request #13575 from amd-nithyavs/16Dec2...

  • check_signed_off: does not contain a valid Signed-off-by line

Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks!

Adds some bug fixes to the acoll component.
- gather is corrected to work for non-zero root ranks when three stage
  algorithm is used.
- Busy waits in shared memory bcast and barrier are changed to avoid
  random hangs.
- Overrides with command line arguments are properly taken care of in
  bcast algorithm selection.
- Multinode path of reduce and allreduce is fixed to use the
  hierarchical algorithms of acoll.
- Compile time warnings are removed.

Signed-off-by: Nithya V S <[email protected]>
(cherry picked from commit 12f6838)

Changes in formatting to adhere to ompi coding style.

Signed-off-by: Nithya V S <[email protected]>
(cherry picked from commit a3d2d4a)

Fixes a condition for freeing temporary buffer in acoll reduce.

Signed-off-by: Nithya V S <[email protected]>
(cherry picked from commit 6838054)

@amd-nithyavs amd-nithyavs force-pushed the 6Jan2026_6.0_acoll_fixes branch from efd85b5 to a037a7e January 6, 2026 16:52
@amd-nithyavs
Contributor Author

> Please do not cherry pick merge commits.

Cherry-picked from the PR. Please let me know if it is fine now.
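
The resolution in this exchange was to pick the individual commits rather than the merge commit. Below is a minimal sketch of that workflow, run in a throwaway repository. The branch names (`release`, `feature`), the file, the commit subject, and the identity are all hypothetical, not taken from this PR. The `(cherry picked from commit ...)` lines in the commit messages above match the format that `git cherry-pick -x` appends, and `-s` adds a `Signed-off-by` line, which is what the commit checker looks for.

```shell
#!/bin/sh
set -eu

# Throwaway repository so the example is self-contained.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"        # hypothetical identity
git config user.email "dev@example.com"

# A base commit shared by both branches.
echo base > file.txt
git add file.txt
git commit -q -m "base"
git branch release                        # stands in for a release branch

# A fix committed on a development branch.
git checkout -q -b feature
echo fix >> file.txt
git commit -q -am "coll/example: fix something"
fix_sha=$(git rev-parse HEAD)

# Pick the individual (non-merge) commit onto the release branch.
# -x records the source commit in the message, -s adds Signed-off-by.
git checkout -q release
git cherry-pick -x -s "$fix_sha" >/dev/null

# Show the resulting commit message, including both added lines.
git log -1 --format=%B
```

A merge commit has more than one parent, so `git cherry-pick` refuses it unless a mainline parent is given with `-m`, and the result would fold several changes into one commit. Picking the underlying commits one by one keeps each change and its sign-off separate.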

@jsquyres jsquyres dismissed their stale review January 6, 2026 22:04

Thanks! I'll dismiss my NACK, and let others who are familiar with acoll do the actual review.

@hppritcha hppritcha merged commit 14b185a into open-mpi:v6.0.x Jan 8, 2026
22 of 24 checks passed