From 46491403bcc794a889103bc0a13329d06454227f Mon Sep 17 00:00:00 2001
From: Ruifeng Zheng
Date: Tue, 1 Aug 2023 10:27:43 +0800
Subject: [PATCH] [SPARK-44586][INFRA][ML][PYTHON] `TorchDistributor` should install CPU-only Torch for testing

### What changes were proposed in this pull request?
According to https://pytorch.org/get-started/locally/, the existing command installs PyTorch with CUDA on Linux:
![image](https://github.com/apache/spark/assets/7322292/7957524b-8331-495b-ae8e-feda0a6934af)

This PR switches to the CPU-only PyTorch build, since the CUDA build takes too much disk space.
![image](https://github.com/apache/spark/assets/7322292/4cd18513-5eb1-4522-acea-689c5f67e6fc)

### Why are the changes needed?
CI has no GPU, so installing the large CUDA packages serves no purpose; see https://github.com/zhengruifeng/spark/actions/runs/5692361308/job/15429270610
![image](https://github.com/apache/spark/assets/7322292/56cd8b5b-5acb-483a-972d-ee467b7d5461)
![image](https://github.com/apache/spark/assets/7322292/8ead5f99-95c4-4d0d-8bcc-d7452b4986e1)

This PR saves 3.3G of disk space.

**[before this PR](https://github.com/zhengruifeng/spark/actions/runs/5692361308/job/15429270610)**
![image](https://github.com/apache/spark/assets/7322292/44bd77dd-a73e-4f92-a4ad-c3221265d3d3)

**[after this PR](https://github.com/zhengruifeng/spark/actions/runs/5692416121/job/15444345860)**
![image](https://github.com/apache/spark/assets/7322292/54a80cf9-366b-42b3-8de1-76ebf1de1607)

### Does this PR introduce _any_ user-facing change?
No, test-only.

### How was this patch tested?
Updated CI.

Closes #42210 from zhengruifeng/infra_torch_cpuonly.

Authored-by: Ruifeng Zheng
Signed-off-by: Ruifeng Zheng
---
 dev/infra/Dockerfile | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/dev/infra/Dockerfile b/dev/infra/Dockerfile
index af8e1a980f93c..9d7b29e25b49b 100644
--- a/dev/infra/Dockerfile
+++ b/dev/infra/Dockerfile
@@ -71,4 +71,5 @@ RUN python3.9 -m pip install numpy pyarrow 'pandas<=2.0.3' scipy unittest-xml-re
 RUN python3.9 -m pip install grpcio protobuf googleapis-common-protos grpcio-status
 
 # Add torch as a testing dependency for TorchDistributor
-RUN python3.9 -m pip install torch torchvision torcheval
+RUN python3.9 -m pip install torch torchvision --index-url https://download.pytorch.org/whl/cpu
+RUN python3.9 -m pip install torcheval