Commit 3e6b05e
[DeepSeek] Fix weight_dequant kwargs in fixup_moe_expert_amax
weight_dequant(x, s, block_size=128, dtype=...) — the third positional arg
is block_size, not dtype. Passing torch.bfloat16 there sets block_size to
the dtype object, which would either fail inside the triton kernel or
compute amax over corrupt blocks for any uncalibrated expert.
The bug never fired in our validation run because every expert was
activated during calibration (top-k over 256 experts × 1024 samples), so
the _missing(wq.amax) branch was dead. Spotted by bot review on PR #1380.
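The mix-up can be sketched with a minimal stand-in for `weight_dequant`. The body below is hypothetical (the real implementation is a Triton kernel); only the signature from the commit message matters:

```python
import torch

# Hypothetical stand-in for the real triton-backed weight_dequant;
# only the signature matches the function described in the commit.
def weight_dequant(x, s, block_size=128, dtype=torch.bfloat16):
    if not isinstance(block_size, int):
        # The real kernel would either fail here or tile the weight
        # matrix into corrupt blocks before the amax reduction.
        raise TypeError(f"block_size must be an int, got {block_size!r}")
    return (x.float() * s.float()).to(dtype)

x = torch.ones(2, 2)
s = torch.full((2, 2), 0.5)

# Buggy call: torch.bfloat16 lands in the block_size slot.
try:
    weight_dequant(x, s, torch.bfloat16)
except TypeError as e:
    print("bug reproduced:", e)

# Fixed call: dtype passed by keyword, block_size keeps its default.
w = weight_dequant(x, s, dtype=torch.bfloat16)
print(w.dtype)  # torch.bfloat16
```

Passing `dtype` by keyword makes the call robust even if `block_size` gains a different default later.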
Signed-off-by: Chenjie Luo <chenjiel@nvidia.com>
1 parent f5d57cb
1 file changed
Lines changed: 3 additions & 1 deletion
(diff content not captured: 1 line removed at line 293, 3 lines added at lines 293–295)