fix for H200: update 550 CUDA driver with fabric manager topology fix #6120

Merged
@ganeshkumarashok merged 4 commits into master from aganeshkumar/cuda_update_h200
Mar 29, 2025

@ganeshkumarashok commented Mar 28, 2025

What type of PR is this?
/kind bug

What this PR does / why we need it:
Updates the aks-gpu container image to the 550 CUDA driver that includes the fabric manager topology fix for H200 GPUs.

Which issue(s) this PR fixes:

Fixes #

Requirements:

Special notes for your reviewer:

Release note:

none

@ganeshkumarashok changed the title from "Update CUDA driver for H200 fix" to "fix for H200: update 550 CUDA driver with fabric manager topology fix" on Mar 28, 2025
@ganeshkumarashok
Contributor Author

Related PR in aks-gpu: Azure/aks-gpu#96


@sfc-gh-raravena left a comment


🙏

@ganeshkumarashok (Contributor, Author) commented Mar 29, 2025

Hello @sfc-gh-raravena! Sorry for the disruption. Hope to see you at KubeCon next week ;)


No changes to cached containers or packages on Windows VHDs

@ganeshkumarashok merged commit 20f8dba into master on Mar 29, 2025
32 checks passed
@ganeshkumarashok deleted the aganeshkumar/cuda_update_h200 branch on March 29, 2025 22:50