[ROCm][CI] Stage C mirrors #42793
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
Code Review
This pull request updates the Buildkite CI configuration to expand AMD GPU testing support for MI300 and MI325 hardware. Key changes include adding AMD mirror configurations across multiple test areas—such as attention, engine, entrypoints, and speculative decoding—and introducing new integration tests for speculators and hidden states. Reviewers suggested marking the new "Extract Hidden States Integration" test as optional to avoid CI bottlenecks and recommended updating parent step dependencies to include ROCm-specific files, ensuring that relevant code changes properly trigger the mirrored AMD tests.
@@ -13,6 +13,20 @@ steps:
- tests/v1/attention
The parent step's source_file_dependencies should include all files that the mirrors depend on. Currently, changes to vllm/_aiter_ops.py, vllm/envs.py, or vllm/platforms/rocm.py will not trigger this step, and consequently, the AMD mirror will not run for those changes.
- tests/v1/attention
- vllm/_aiter_ops.py
- vllm/envs.py
- vllm/platforms/rocm.py
@@ -149,3 +149,21 @@ steps:
- vllm/model_executor/models/whisper.py
The parent step is missing several dependencies that are explicitly required by the AMD mirror (e.g., vllm/platforms/rocm.py, vllm/_aiter_ops.py). If these files are modified, the parent step will not trigger, preventing the AMD mirror from running. The parent's source_file_dependencies should be the union of all dependencies across mirrors.
- vllm/model_executor/models/whisper.py
- vllm/model_executor/layers/
- vllm/v1/attention/backends/
- vllm/v1/attention/selector.py
- vllm/_aiter_ops.py
- vllm/platforms/rocm.py
- vllm/model_executor/model_loader/
@@ -36,6 +36,21 @@ steps:
- tests/v1/e2e/spec_decode/
@@ -60,6 +75,20 @@ steps:
- tests/v1/e2e/spec_decode/
@@ -71,6 +100,20 @@ steps:
- tests/v1/e2e/spec_decode/
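The review comments above ask that a parent step's source_file_dependencies be the union of the dependencies of all its mirrors. A minimal Python sketch of that check follows. It assumes each dependency entry is matched as a path prefix, and it uses a hypothetical in-memory layout (a parent list plus a dict of mirror lists) rather than the actual Buildkite YAML schema. The function names and the example lists are illustrative only; the three vllm/ paths are the ones named in the first review comment.

```python
def is_covered(path: str, deps: list[str]) -> bool:
    """Return True if `path` falls under any entry in `deps`.

    Assumption: entries are matched as plain path prefixes.
    """
    return any(path.startswith(dep) for dep in deps)


def missing_from_parent(
    parent_deps: list[str], mirrors: dict[str, list[str]]
) -> dict[str, list[str]]:
    """For each mirror, list the dependencies the parent does not cover."""
    missing: dict[str, list[str]] = {}
    for name, deps in mirrors.items():
        gaps = [dep for dep in deps if not is_covered(dep, parent_deps)]
        if gaps:
            missing[name] = gaps
    return missing


def union_deps(
    parent_deps: list[str], mirrors: dict[str, list[str]]
) -> list[str]:
    """Extend the parent list so that it covers every mirror dependency."""
    merged = list(parent_deps)
    for deps in mirrors.values():
        for dep in deps:
            if not is_covered(dep, merged):
                merged.append(dep)
    return merged


# Hypothetical example data modeled on the attention step discussed above.
parent = ["tests/v1/attention"]
mirrors = {
    "amd": [
        "tests/v1/attention",
        "vllm/_aiter_ops.py",
        "vllm/envs.py",
        "vllm/platforms/rocm.py",
    ]
}

gaps = missing_from_parent(parent, mirrors)
fixed_parent = union_deps(parent, mirrors)
print(gaps)
print(fixed_parent)
```

With the example data, the check reports the three ROCm-related files as uncovered by the parent. After merging, the parent list covers every mirror dependency, so a change to any of those files would trigger the parent step and therefore its AMD mirror.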
Mirrored/gated test groups:
Agent count: 6 x mi300_1, 2 x mi325_1
Notes:
Added Speculators Correctness and Extract Hidden States Integration in test-amd.yaml under mi300_1.
cc @kenroche