Commit 4ebc3bf
[OMNIML-5003] Restrict non-gated detection to single up_proj (review)
Address review feedback:
- _fused_experts_wrapper_class now claims _QuantNonGatedFusedExperts only for a
3-D up_proj with no gate_proj and no gate_up_proj. A split-gated container
(separate 3-D gate_proj/up_proj/down_proj, three F.linear calls per expert)
falls through to None/unsupported instead of being mis-wrapped, since the
two-call toggle and up_proj-storage index recovery assume exactly two calls.
- Add test_split_gated_layout_not_claimed_as_nongated and
test_get_quant_config_resolves_nongated_experts (down_proj anchors format /
has_quantizers detection, so the produced quant config is correct).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Jennifer Chen <jennifchen@nvidia.com>
1 parent 7f22c90 commit 4ebc3bf
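The detection rule described in the commit message can be sketched roughly as below. This is not the code from `_fused_experts_wrapper_class`: the function `classify_expert_layout`, the helper `weight`, and the `"non_gated"` return label are hypothetical names used only for illustration, and the stand-in objects expose just an `ndim` attribute in place of real tensors. Only the non-gated branch is modeled; any other layout returns `None` here.

```python
from types import SimpleNamespace


def _is_3d(param):
    """True if `param` looks like a stacked per-expert weight (3 dims)."""
    return param is not None and getattr(param, "ndim", None) == 3


def classify_expert_layout(module):
    """Hypothetical stand-in for the rule in the commit message.

    Claims the non-gated layout only for a 3-D up_proj with neither
    gate_proj nor gate_up_proj. Everything else returns None, including
    a split-gated container with separate gate_proj/up_proj/down_proj.
    """
    up = getattr(module, "up_proj", None)
    has_gate = getattr(module, "gate_proj", None) is not None
    has_gate_up = getattr(module, "gate_up_proj", None) is not None
    if _is_3d(up) and not has_gate and not has_gate_up:
        return "non_gated"
    return None


def weight(*shape):
    # Stand-in for a tensor: only the number of dimensions matters here.
    return SimpleNamespace(shape=shape, ndim=len(shape))


# Non-gated container: up_proj and down_proj only.
non_gated = SimpleNamespace(up_proj=weight(4, 16, 8), down_proj=weight(4, 8, 16))

# Split-gated container: three separate 3-D projections per expert.
split_gated = SimpleNamespace(
    gate_proj=weight(4, 16, 8),
    up_proj=weight(4, 16, 8),
    down_proj=weight(4, 8, 16),
)

print(classify_expert_layout(non_gated))
print(classify_expert_layout(split_gated))
```

The point of the stricter check is that the presence of a 3-D `up_proj` alone is not enough: a split-gated container also has one, but it makes three linear calls per expert, which breaks the two-call assumption the non-gated wrapper relies on.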
2 files changed
Lines changed: 61 additions & 4 deletions
File tree
- modelopt/torch/quantization/plugins
- tests/unit/torch/quantization/plugins
Lines changed: 51 additions & 1 deletion