Xuncai/all gather fixes #17155

Merged
merged 2 commits into from
Jan 31, 2025

@caixunshiren caixunshiren commented Jan 27, 2025

Ticket

This PR fixes two issues exposed when integrating CCL async into TG Llama:

  • Program caching does not handle input tensor specs properly
  • All gather does not handle subdevice properly
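The first bullet describes a general class of caching bug. The sketch below is a generic illustration of that class, not tt-metal code: the PR's diff is not shown here, so `TensorSpec`, `ProgramCache`, and the `include_specs` flag are hypothetical names chosen only for this example. It assumes the failure mode is a cache key that omits the input tensor specs, so two calls with different input shapes collide on one cached program.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TensorSpec:
    """Hypothetical stand-in for an input tensor spec (hashable because frozen)."""
    shape: tuple
    dtype: str
    layout: str


class ProgramCache:
    """Toy program cache; include_specs toggles whether input specs enter the key."""

    def __init__(self, include_specs: bool):
        self.include_specs = include_specs
        self.programs = {}
        self.compiles = 0  # number of cache misses

    def get(self, op_name: str, dim: int, specs: list) -> str:
        # Assumed bug: the key is built from op attributes only.
        key = (op_name, dim)
        if self.include_specs:
            # Assumed fix: fold every input tensor spec into the key.
            key += tuple(specs)
        if key not in self.programs:
            self.compiles += 1
            shapes = [s.shape for s in specs]
            self.programs[key] = f"{op_name}(dim={dim}) compiled for {shapes}"
        return self.programs[key]


spec_a = TensorSpec((1, 1, 32, 1024), "bfloat16", "TILE")
spec_b = TensorSpec((1, 1, 32, 2048), "bfloat16", "TILE")

# Key without specs: the second call reuses the program built for spec_a.
buggy = ProgramCache(include_specs=False)
first = buggy.get("all_gather", 3, [spec_a])
stale = buggy.get("all_gather", 3, [spec_b])

# Key with specs: each distinct input spec gets its own program.
fixed = ProgramCache(include_specs=True)
prog_a = fixed.get("all_gather", 3, [spec_a])
prog_b = fixed.get("all_gather", 3, [spec_b])
```

In this toy model the spec-free cache compiles once and hands the second caller a program built for the wrong shape, while the spec-aware cache compiles twice. Whether the actual fix in this PR takes this form is an assumption.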

Checklist

@caixunshiren caixunshiren force-pushed the xuncai/all-gather-fixes branch from dfc4c8b to e5564d4 Compare January 30, 2025 19:04

@SeanNijjar SeanNijjar left a comment

Do you know if there are any tests you can add along with the PR? Is there one from model team that is usable?

@caixunshiren

> Do you know if there are any tests you can add along with the PR? Is there one from model team that is usable?

Would it be better if I add the test along with my CCL minimal PR, so that these specific CCL shapes can be included?

@caixunshiren caixunshiren merged commit a53f8dd into main Jan 31, 2025
205 of 245 checks passed
@caixunshiren caixunshiren deleted the xuncai/all-gather-fixes branch January 31, 2025 15:26
nikileshx pushed a commit to nikileshx/tt-metal that referenced this pull request Feb 3, 2025
### Ticket
This PR fixes two issues exposed when integrating ccl async to TG llama:
- Program caching does not handle input tensor specs properly
- All gather does not handle subdevice properly

### Checklist
- [x] Post commit CI passes:
https://github.com/tenstorrent/tt-metal/actions/runs/13059537328
- [x] TG Nightly:
https://github.com/tenstorrent/tt-metal/actions/runs/13059546044
- [x] TG unit frequent:
https://github.com/tenstorrent/tt-metal/actions/runs/13059554924
- [x] T3K:
https://github.com/tenstorrent/tt-metal/actions/runs/13059564134

---------

Co-authored-by: avoraTT <[email protected]>
hschoi4448 pushed a commit that referenced this pull request Feb 20, 2025

3 participants