# Cortex-M: Fuse relu activation into quantized_add (#18462)
### Summary
ResNet8 has skip connections of the form relu(add(conv(x), skip(x))). The
ActivationFusionPass only fused relu into conv/linear, leaving 3 unfused
relu ops that fell through to the portable aten::relu.out kernel, which
incorrectly clamps int8 tensors at a literal 0 instead of at the quantized
zero_point, causing numerical mismatches on the FVP.
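For intuition: an int8 value equal to the zero_point represents a real 0.0, so a quantized relu must clamp at the zero_point rather than at the literal value 0. A minimal sketch in plain PyTorch with illustrative qparams (not values taken from this PR):

```python
import torch

scale, zero_point = 0.05, -10                        # illustrative qparams only
q = torch.tensor([-20, -10, 5], dtype=torch.int8)    # real values: -0.5, 0.0, 0.75

wrong = q.clamp(min=0)                               # literal-0 clamp -> [0, 0, 5]
right = q.clamp(min=zero_point)                      # quantized relu -> [-10, -10, 5]

dequant = lambda t: (t.int() - zero_point) * scale
print(dequant(wrong))   # tensor([0.5000, 0.5000, 0.7500])  -- not relu(x)
print(dequant(right))   # tensor([0.0000, 0.0000, 0.7500])  -- matches relu(x)
```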
The fix:
- Add fused activation patterns (relu, hardtanh, clamp) for add/add_ to
  BINARY_OP_PATTERNS in quantizer_support.py so the quantizer produces
  activation-aware quantization bounds.
- Add aten.add.Tensor to ActivationFusionPass FUSE_OPS.
- Update QuantizedOpFusionPass to read the activation bounds from
  output_qparams and pass them to quantized_add (see the sketch after this
  list).
- Update the quantized_add operator (schema, meta, impl, C++) to accept
  activation_min/activation_max parameters.
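As a rough sketch of the bound computation (the helper name and signature below are hypothetical; the actual pass derives equivalent bounds from output_qparams):

```python
def activation_bounds(activation, output_zero_point, output_scale,
                      qmin=-128, qmax=127):
    """Map a fused activation onto int8 clamp bounds.

    Hypothetical helper for illustration; the real QuantizedOpFusionPass
    reads equivalent bounds from the node's output_qparams.
    """
    if activation is None:            # no fused activation: full int8 range
        return qmin, qmax
    if activation == "relu":          # real 0.0 quantizes to the zero_point
        return max(qmin, output_zero_point), qmax
    if activation == "hardtanh":      # assuming the default hardtanh(-1.0, 1.0)
        lo = output_zero_point + round(-1.0 / output_scale)
        hi = output_zero_point + round(1.0 / output_scale)
        return max(qmin, lo), min(qmax, hi)
    raise NotImplementedError(activation)
```

For a fused relu, activation_min therefore ends up equal to the output zero_point while activation_max stays at 127.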
---------
Co-authored-by: Claude <noreply@anthropic.com>
#### backends/cortex_m/ops/operators.yaml (1 addition, 1 deletion)
```diff
@@ -17,7 +17,7 @@
     - arg_meta: null
       kernel_name: cortex_m::dequantize_per_tensor_out

-- func: cortex_m::quantized_add.out(Tensor self, int self_zero_point, int self_multiplier, int self_shift, Tensor other, int other_zero_point, int other_multiplier, int other_shift, int output_zero_point, int output_multiplier, int output_shift, *, Tensor(a!) out) -> Tensor(a!)
+- func: cortex_m::quantized_add.out(Tensor self, int self_zero_point, int self_multiplier, int self_shift, Tensor other, int other_zero_point, int other_multiplier, int other_shift, int output_zero_point, int output_multiplier, int output_shift, int activation_min, int activation_max, *, Tensor(a!) out) -> Tensor(a!)
```
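At the kernel level the two new arguments simply bound the requantized sum. A float-scale reference sketch follows (the real cortex_m kernel uses the integer multiplier/shift pairs from the schema above; parameter names mirror the schema, but this is not its implementation):

```python
import torch

def quantized_add_ref(self_q, self_zero_point, self_scale,
                      other_q, other_zero_point, other_scale,
                      output_zero_point, output_scale,
                      activation_min=-128, activation_max=127):
    # Dequantize both operands and add in real-valued space.
    acc = (self_q.int() - self_zero_point) * self_scale \
        + (other_q.int() - other_zero_point) * other_scale
    # Requantize to the output qparams.
    out = torch.round(acc / output_scale).int() + output_zero_point
    # Fused activation: for relu, activation_min equals the output zero_point.
    return out.clamp(activation_min, activation_max).to(torch.int8)
```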