Commit 70aa572
feat: enable dynamic LoRA adapter loading on gpt-oss-120b
Adds --enable-lora with max-loras=4, max-lora-rank=64, plus
VLLM_ALLOW_RUNTIME_LORA_UPDATING=true and a /adapters mount
backed by a new gpt-oss-120b-adapters PVC (50Gi RWX nfs-csi).
max-loras=4 fits comfortably on the H100 (80GB).
Pins the vLLM image to the same digest as the ministral commit:
sha256:04563c302537a91aa49ebdfbceda96111c5712275999b7e8804fa598f0b5641d
DO NOT APPLY without first running Phase 0.3 from the rollout
plan: gpt-oss-120b is MoE; vLLM LoRA support for MoE has been
incremental and must be verified against the pinned digest before
this lands in prod. Acceptable evidence is either a successful
runtime /v1/load_lora_adapter against a small public adapter or
a release-note confirmation for the pinned vLLM build.
If MoE+LoRA turns out to be unsupported, revert just this commit;
the ministral rollout and shared infra (template, docs, litellm
configmap) stay intact.
1 parent 72ce9c0 commit 70aa572
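The runtime check named in the message as acceptable evidence could be scripted roughly as follows. This is a minimal sketch, not the rollout plan's actual Phase 0.3 procedure: it assumes the pinned vLLM build exposes `POST /v1/load_lora_adapter` with a JSON body of `lora_name` and `lora_path` and answers 200 on success. The base URL, adapter name, and adapter directory are hypothetical placeholders.

```python
import json
import urllib.error
import urllib.request

# Hypothetical endpoint; replace with the real gpt-oss-120b service address.
BASE_URL = "http://localhost:8000"


def build_load_request(base_url: str, lora_name: str, lora_path: str) -> urllib.request.Request:
    """Build the POST request for the runtime adapter-loading endpoint."""
    payload = json.dumps({"lora_name": lora_name, "lora_path": lora_path}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/load_lora_adapter",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def load_adapter(base_url: str, lora_name: str, lora_path: str, timeout: float = 120.0) -> bool:
    """Return True if the server accepts the adapter (assumed: HTTP 200)."""
    req = build_load_request(base_url, lora_name, lora_path)
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status == 200
    except urllib.error.HTTPError as err:
        # The server answered but rejected the load, e.g. unsupported MoE layers.
        print(f"load failed: HTTP {err.code}: {err.read().decode('utf-8', 'replace')}")
        return False
    except urllib.error.URLError as err:
        # No HTTP response at all (DNS failure, connection refused, timeout).
        print(f"server unreachable: {err.reason}")
        return False
```

A call such as `load_adapter(BASE_URL, "phase03-smoke", "/adapters/phase03-smoke")` would exercise the new /adapters mount, assuming a small public adapter has been copied to that hypothetical directory first. A `False` result is the signal to revert this commit as described above.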
2 files changed
Lines changed: 30 additions & 2 deletions
File 1 of 2 (name and diff body not preserved): 18 additions, 2 deletions. Lines 21 and 32 modified; lines 36-44, 57-58, 64-65, and 82-84 added (new-file numbering).
File 2 of 2 (name and diff body not preserved): 12 additions, at lines 12-23.