[NOT FOR LAND] cuda benchmark without triton #3509
cuda.yml
on: pull_request
Matrix: export-model-cuda-artifact
Matrix: test-cuda-builds
Matrix: test-models-cuda
Matrix: test-model-cuda-e2e
check-all-cuda-builds (3s)
Artifacts
Produced during runtime
| Name | Status | Size | Digest |
|---|---|---|---|
| google-gemma-3-4b-it-cuda-non-quantized | Expired | 7.22 GB | sha256:864a90bf67ab339acc6ccdb2c27b4f3c345c9711d45e92a47e51987b925afea9 |
| google-gemma-3-4b-it-cuda-quantized-int4-tile-packed | Expired | 4.03 GB | sha256:e74ca8dafe80490e28e3d8ecd3ad68c89709b4f261f0844513b0fcaba5989bd3 |
| mistralai-Voxtral-Mini-3B-2507-cuda-non-quantized | Expired | 6.82 GB | sha256:fa7f0842653d72c8779783ca43c39aca4302422be23f69dec609429ec890b0f9 |
| mistralai-Voxtral-Mini-3B-2507-cuda-quantized-int4-tile-packed | Expired | 2.89 GB | sha256:db8f21d05ca24e7bb43951287c30e9578de8bd2b7dc03c35bd68c862de8f382d |
| mistralai-Voxtral-Mini-3B-2507-cuda-quantized-int4-weight-only | Expired | 6.14 GB | sha256:87a524b935e85f2a4a42eea343b13403feca9f62df4ef150d76383f4a74ec025 |
| openai-whisper-large-v3-turbo-cuda-non-quantized | Expired | 1.17 GB | sha256:5ed5e88d10f7d04d7a420f846b1a9a431c9915f876f14c60825a7e74b5075364 |
| openai-whisper-large-v3-turbo-cuda-quantized-int4-tile-packed | Expired | 490 MB | sha256:6f6dacd4ec2f111cb3154b36aa96e052c0d9ef5e48a76f02e57c56aa29f1735f |
| openai-whisper-large-v3-turbo-cuda-quantized-int4-weight-only | Expired | 484 MB | sha256:e90c97b8e9c068960f3a607d062db9225f5d49a7480ef4b48b2abf4106e086eb |
| openai-whisper-small-cuda-non-quantized | Expired | 361 MB | sha256:c6125bd528011fbd6b9288c83886a4118a946434a687b1f81e3de825487aba48 |
| openai-whisper-small-cuda-quantized-int4-tile-packed | Expired | 172 MB | sha256:4bc786f8b221f647ff5653a16baeb37b90b55d2e7110b8d51f96242720154b9c |
| openai-whisper-small-cuda-quantized-int4-weight-only | Expired | 270 MB | sha256:a82a4ca36b8456b459656180ce1de0c0a7701566b438797c2857c38319df62c5 |