Commit 78f4c42
chore(ptq): drop --exclude_modules CLI flag (recipes own exclusions)
The `--exclude_modules` flag was added in this PR as an escape hatch for
overriding the auto-applied lm_head/embedding inclusion on Nemotron-H. Now
that meenchen's recipe-system review is addressed and the Nemotron-H
extensions live in `modelopt_recipes/models/Nemotron-H/nvfp4_w4a16.yaml`,
this flag has no remaining purpose: users who want different exclusions
write a different recipe.
Removes:
* the `--exclude_modules` argparse entry in `hf_ptq.py`
* the `args.exclude_modules` apply-loop in `quantize_main()`
* the `EXCLUDE_MODULES` env-var passthrough + `EXCLUDE_MODULES_ARGS` bash
  array in `examples/llm_ptq/scripts/huggingface_example.sh`
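For context, a CLI escape hatch of this kind usually amounts to a loop that adds one "disabled" entry per user-supplied glob pattern. The sketch below is illustrative only and is not the removed code: `apply_excludes`, `is_enabled`, the dict layout of the config, and the module names are all assumptions. Only the `*mixer.conv1d*` pattern comes from this commit message.

```python
import fnmatch


def apply_excludes(quant_cfg, patterns):
    """Return a copy of quant_cfg with each glob pattern mapped to a disabled entry."""
    merged = dict(quant_cfg)
    for pattern in patterns or []:
        merged[pattern] = {"enable": False}
    return merged


def is_enabled(quant_cfg, quantizer_name):
    """Resolve a quantizer name against the patterns; later entries win."""
    enabled = False  # nothing matched: treat the quantizer as disabled
    for pattern, cfg in quant_cfg.items():
        if fnmatch.fnmatchcase(quantizer_name, pattern):
            enabled = cfg.get("enable", True)
    return enabled


# Hypothetical base config and a user-supplied exclusion pattern.
base_cfg = {"*weight_quantizer": {"num_bits": 4}}
cfg = apply_excludes(base_cfg, ["*mixer.conv1d*"])
```

With the flag gone, the same effect would come from putting the pattern in the recipe rather than passing it on the command line.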
Verified end-to-end on `nvidia/NVIDIA-Nemotron-3-Nano-4B-BF16` with
`--recipe models/Nemotron-H/nvfp4_w4a16` (transformers 4.56.2, GPU 5,
calib_size=16). Coverage is the same as before:
* 94 weight quantizers enabled, 21 disabled (the Mamba `*mixer.conv1d*`
  layers)
* `lm_head.weight_quantizer` and `backbone.embeddings.weight_quantizer`
  carry the NVFP4 W4A16 config
* exported safetensors are 2.1 GiB
* `hf_quant_config.json` reports `quant_algo=NVFP4_W4A16`,
  `group_size=16`, `exclude_modules=[21 conv1d layers]`
The recipe still dictates the exclusion set, so behavior is unchanged for
the supported codepath.
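The `hf_quant_config.json` check described above can be scripted. This is a minimal sketch, not part of the commit: `summarize_quant_config` is a hypothetical helper, and it assumes the fields sit under a top-level `quantization` key, falling back to the top level if that key is absent.

```python
import json
from pathlib import Path


def summarize_quant_config(export_dir):
    """Read hf_quant_config.json from an export directory and return key fields."""
    raw = json.loads((Path(export_dir) / "hf_quant_config.json").read_text())
    # Assumed layout: fields nested under "quantization"; otherwise use the top level.
    section = raw.get("quantization", raw)
    return {
        "quant_algo": section.get("quant_algo"),
        "group_size": section.get("group_size"),
        "num_excluded": len(section.get("exclude_modules") or []),
    }
```

For the run reported here, the summary would be expected to show `NVFP4_W4A16`, a group size of 16, and 21 excluded modules.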
Signed-off-by: ajrasane <131806219+ajrasane@users.noreply.github.com>
1 parent e63965e commit 78f4c42
2 files changed
Lines changed: 0 additions & 36 deletions
Removed line ranges (the diff content itself was not preserved):
| File | Original lines removed | Lines removed |
|---|---|---|
| First file | 1131-1142, 1341-1351 | 23 |
| Second file | 130-141, 198 | 13 |