CUDA 12.4 ARM wheel integration to CD - nightly build #126174
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/126174
Note: Links to docs will display an error until the docs builds have been completed.
✅ You can merge normally! (2 Unrelated Failures)
As of commit 42bb184 with merge base f0366de:
UNSTABLE - The following jobs failed but were likely due to flakiness present on trunk and has been marked as unstable:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@pytorchbot rebase
@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here
Successfully rebased
f8dc9c3 to e717e5a
.github/workflows/generated-linux-aarch64-binary-manywheel-nightly.yml
Outdated
Trying to explore options to help land pytorch/pytorch#126174. The current [manywheel-py3_9-cuda-aarch64-build / build](https://github.com/pytorch/pytorch/actions/runs/9112985273/job/25053413689?pr=126174#logs) job takes around 6 hrs (building only the sm90 arch). Hence trying a slightly bigger worker to see if we can bring the build time down to a manageable 3-3.5 hrs.
@pytorchbot rebase
@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here
Successfully rebased
503eaca to d2cdca7
.github/workflows/generated-linux-aarch64-binary-manywheel-nightly.yml
Outdated
Changes look great!
@@ -71,12 +71,19 @@ jobs:
{%- if config.pytorch_extra_install_requirements is defined and config.pytorch_extra_install_requirements|d('')|length > 0 %}
PYTORCH_EXTRA_INSTALL_REQUIREMENTS: !{{ config.pytorch_extra_install_requirements }}
{%- endif %}
{%- if config["gpu_arch_type"] == "cuda-aarch64" %}
This approach is good; we just need to remove the else condition that sets timeout-minutes to 210.
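For readers following the review, here is a minimal Python sketch of the shape being asked for: emit a `timeout-minutes` line only when `gpu_arch_type` is `cuda-aarch64`, with no else branch for other builds. The function name `timeout_line` and the value 420 are hypothetical and are not taken from the PR; the actual change lives in the Jinja workflow template.

```python
from typing import Optional


def timeout_line(gpu_arch_type: str, cuda_aarch64_timeout: int) -> Optional[str]:
    """Return a `timeout-minutes` line only for cuda-aarch64 builds.

    Hypothetical helper for illustration; the PR does this in a Jinja template.
    """
    if gpu_arch_type == "cuda-aarch64":
        return f"timeout-minutes: {cuda_aarch64_timeout}"
    # No else branch: other builds emit nothing and keep the workflow default.
    return None


# 420 is a placeholder value, not a number from the PR.
print(timeout_line("cuda-aarch64", 420))
print(timeout_line("cpu", 420))
```

Returning nothing for the other architectures mirrors the reviewer's request: without an else branch, those jobs are left on whatever timeout the workflow already applies.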
@pytorchbot rebase
@pytorchbot merge -i
Merge started
Your change will be merged while ignoring the following 2 checks: windows-binary-wheel / wheel-py3_8-cpu-test, windows-binary-conda / conda-py3_8-cuda12_1-test
Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team
The 3-10 aarch64 failure is due to a timeout (6 hrs limit), possibly caused by network conditions; timeout-minutes can be increased accordingly later. Merging, as all the other aarch64 builds are passing. The Windows build failure is unrelated.
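Since the thread mixes hours and `timeout-minutes`, a small conversion sketch may help relate the figures quoted so far: the 6 hrs limit, the 3-3.5 hrs build-time target, and the 210 value from the review. The helper name `hours_to_minutes` is made up for illustration.

```python
def hours_to_minutes(hours: float) -> int:
    """Convert a duration in hours to whole minutes, the unit used by `timeout-minutes`."""
    return round(hours * 60)


limit = hours_to_minutes(6)          # the 6 hrs limit mentioned in the thread
target_low = hours_to_minutes(3)     # lower end of the 3-3.5 hrs target
target_high = hours_to_minutes(3.5)  # upper end of the 3-3.5 hrs target
print(limit, target_low, target_high)  # 360 180 210
```

The upper end of the target, 3.5 hrs, works out to the same 210 minutes that appears in the review comment.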
Merge failed
Reason: 1 jobs have failed, first few of them are: linux-aarch64-binary-manywheel / manywheel-py3_11-cuda-aarch64-build / build
Details for Dev Infra team: Raised by workflow job
@pytorchbot rebase
@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here
Rebase failed due to Command
Raised by https://github.com/pytorch/pytorch/actions/runs/9229584549
@pytorchbot merge -i
Merge started
Your change will be merged while ignoring the following 0 checks:
Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team
The merge job was canceled or timed out. This most often happens if two merge requests were issued for the same PR, or if the merge job was waiting for more than 6 hours for tests to finish. In the latter case, please do not hesitate to reissue the merge command.
@pytorchbot merge
Merge started
Your change will be merged once all checks pass (ETA 0-4 Hours).
Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team
rebasing pytorch#124112. too many conflict files, so starting a new PR.
Test pytorch/builder#1775 (merged) for ARM wheel addition
Test pytorch/builder#1828 (merged) for setting MAX_JOBS
Current issue to follow up: pytorch#126980
Co-authored-by: Aidyn-A <[email protected]>
Pull Request resolved: pytorch#126174
Approved by: https://github.com/nWEIdia, https://github.com/atalman
Rebasing #124112.
There were too many conflicting files, so starting a new PR.
Test pytorch/builder#1775 (merged) for ARM wheel addition
Test pytorch/builder#1828 (merged) for setting MAX_JOBS
Current issue to follow up:
#126980
cc @atalman @malfet @ptrblck @nWEIdia @Aidyn-A