Commit d541324
Disable QKV NVFP4 quantization for Qwen3 MOE (#735)
## What does this PR do?
**Type of change:** Recipe improvement
**Overview:**
Disable NVFP4 quantization of the QKV projections for Qwen3 MOE models,
following the Qwen3 Next recipe, to recover accuracy.
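
The diff body is not reproduced here, so the following is only a minimal sketch of the idea, assuming the quantization config is a dictionary that maps wildcard patterns over quantizer names to settings, with later entries overriding earlier ones. The names `BASE_NVFP4_CFG`, `QKV_PATTERNS`, `disable_qkv_quantization`, and `is_quantized`, the module-name patterns, and the `"qwen3moe"` model-type string are all hypothetical and are not taken from this commit.

```python
import copy
from fnmatch import fnmatchcase

# Hypothetical base config: wildcard pattern -> quantizer settings.
# The real NVFP4 recipe carries more fields; only "enable" matters here.
BASE_NVFP4_CFG = {
    "quant_cfg": {
        "*weight_quantizer": {"enable": True},
        "*input_quantizer": {"enable": True},
    },
    "algorithm": "max",
}

# Assumed module-name patterns for the attention Q/K/V projections.
QKV_PATTERNS = (
    "*self_attn.q_proj*",
    "*self_attn.k_proj*",
    "*self_attn.v_proj*",
)


def disable_qkv_quantization(base_cfg, model_type, qformat):
    """Return a copy of base_cfg with QKV quantizers disabled for Qwen3 MOE + NVFP4."""
    cfg = copy.deepcopy(base_cfg)
    if model_type == "qwen3moe" and qformat == "nvfp4":
        for pattern in QKV_PATTERNS:
            # Appended last, so these entries override the broad patterns above.
            cfg["quant_cfg"][pattern] = {"enable": False}
    return cfg


def is_quantized(cfg, quantizer_name):
    """Resolve whether a quantizer is enabled; the last matching pattern wins."""
    enabled = False
    for pattern, settings in cfg["quant_cfg"].items():
        if fnmatchcase(quantizer_name, pattern):
            enabled = settings.get("enable", True)
    return enabled


cfg = disable_qkv_quantization(BASE_NVFP4_CFG, "qwen3moe", "nvfp4")
qkv_name = "model.layers.0.self_attn.q_proj.weight_quantizer"
expert_name = "model.layers.0.mlp.experts.0.up_proj.weight_quantizer"
print(is_quantized(cfg, qkv_name), is_quantized(cfg, expert_name))
```

In this sketch only the attention projections are excluded; the expert MLP weights remain quantized, and the base config is left unmodified for other model types.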
## Testing
Model accuracy benchmarking
Signed-off-by: Chenjie Luo <chenjiel@nvidia.com>
1 parent b655321 commit d541324
2 files changed
Lines changed: 2 additions & 1 deletion
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 180 | 180 | |
| 181 | 181 | |
| 182 | 182 | |
| 183 | | - |
| | 183 | + |
| 184 | 184 | |
| 185 | 185 | |
| 186 | 186 | |

| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 29 | 29 | |
| 30 | 30 | |
| 31 | 31 | |
| | 32 | + |
| 32 | 33 | |
| 33 | 34 | |
| 34 | 35 | |