[5763424][ONNX][Autocast] Fix ConstantOfShape layer output precision (#789)

gcunhase · web-flow · commit c1956b8e2b69 · 2026-01-16T14:02:36.000-08:00
## What does this PR do?

**Type of change:** Bug fix

**Overview:** Fixed the output precision of ConstantOfShape layers in models with custom ops.

## Usage

```bash
$ python -m modelopt.onnx.quantization --onnx_path=$MODEL_NAME.onnx
```

## Testing

See bug 5763424.

## Before your PR is "*Ready for review*"

- **Make sure you read and follow [Contributor guidelines](https://github.com/NVIDIA/Model-Optimizer/blob/main/CONTRIBUTING.md)** and your commits are signed.
- **Is this change backward compatible?**: Yes
- **Did you write any new necessary tests?**: No
- **Did you add or update any necessary documentation?**: No
- **Did you update [Changelog](https://github.com/NVIDIA/Model-Optimizer/blob/main/CHANGELOG.rst)?**: No

## Additional Information

This issue only affects models with custom ops.

## Summary by CodeRabbit

* **Bug Fixes**
  * Improved type propagation handling for ConstantOfShape operations in ONNX autocast, ensuring correct precision type conversion across related operations.

Signed-off-by: gcunhase <4861122+gcunhase@users.noreply.github.com>
diff --git a/modelopt/onnx/autocast/precisionconverter.py b/modelopt/onnx/autocast/precisionconverter.py
@@ -347,6 +347,8 @@ def _get_np_type(node, inp, opset=onnx.defs.onnx_opset_version()):
                 return node.inputs[1].dtype  # scale type
             elif node.op == "QuantizeLinear":
                 return node.inputs[2].dtype  # zero_point type
+            elif node.op == "ConstantOfShape":
+                return node.attrs["value"].dtype
             elif not inp.dtype or inp.dtype == onnx.TensorProto.UNDEFINED:
                 return None
             elif node.op not in self.custom_ops:
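
The new branch can be illustrated with a simplified, standalone sketch. It is not the actual `_get_np_type` implementation: the `Tensor` and `Node` classes and the `infer_output_dtype` function are hypothetical stand-ins, and dtypes are plain strings. It relies on one fact from the ONNX operator specification: the only input of `ConstantOfShape` is an int64 shape tensor, while the output element type comes from the optional `value` attribute (float32 when the attribute is absent). Propagating the input dtype would therefore presumably yield int64 rather than the fill precision. The float32 fallback is an addition of this sketch; the diff above reads `node.attrs["value"]` directly.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Tensor:
    """Hypothetical stand-in for a graph tensor or an attribute tensor."""
    dtype: Optional[str] = None


@dataclass
class Node:
    """Hypothetical stand-in for a graph node."""
    op: str
    inputs: list = field(default_factory=list)
    attrs: dict = field(default_factory=dict)


def infer_output_dtype(node: Node, inp: Tensor) -> Optional[str]:
    """Return the dtype to propagate to the node's output (simplified)."""
    if node.op == "ConstantOfShape":
        # The output dtype follows the `value` attribute, not the int64
        # shape input. ONNX defaults to a float32 zero when it is absent.
        value = node.attrs.get("value")
        return value.dtype if value is not None else "float32"
    if not inp.dtype:
        # Unknown input type: nothing to propagate.
        return None
    # Fallback for this sketch only: pass the input dtype through.
    return inp.dtype


shape_input = Tensor(dtype="int64")
fp16_fill = Node("ConstantOfShape", [shape_input], {"value": Tensor("float16")})
default_fill = Node("ConstantOfShape", [shape_input])
passthrough = Node("Identity", [Tensor("float32")])

print(infer_output_dtype(fp16_fill, shape_input))             # float16
print(infer_output_dtype(default_fill, shape_input))          # float32
print(infer_output_dtype(passthrough, passthrough.inputs[0])) # float32
```

Without the dedicated branch, the first two calls in this sketch would fall through to the input dtype and return `int64`.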