Context
From PR #872 review feedback by @xieofxie:
consider also apply to qdq and vice versa
a trick is get all types from input model to get the reverse list
Currently fp16_keep_io_types and fp16_op_block_list in WinMLQuantizationConfig are only used when mode="fp16". The suggestion is to:
- Allow FP16 conversion settings (like op_block_list) to also apply when running QDQ quantization (e.g., to skip certain ops from quantization).
- Allow QDQ-specific settings to inform FP16 conversion (e.g., auto-derive the block list from the model's op types).
Proposed Approach
- Inspect all op types in the input model to auto-generate a sensible block/allow list
- Share relevant config fields across FP16 and QDQ paths where applicable
- Keep backward compatibility (explicit user settings always take priority)
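A rough sketch of how the approach above could look, not a proposed final implementation. The helper names (collect_op_types, derive_block_list, resolve_block_list) and the allow_list parameter are hypothetical and do not exist in the codebase today. The model argument is assumed to be shaped like an onnx.ModelProto (model.graph.node[i].op_type); a stand-in object is used here so the snippet runs without onnx installed.

```python
from types import SimpleNamespace
from typing import Iterable, List, Optional, Set


def collect_op_types(model) -> Set[str]:
    """Return the set of op types used by the model's top-level graph.

    Assumes an onnx.ModelProto-like object. Subgraphs nested inside
    control-flow nodes (If, Loop, Scan) are not visited in this sketch.
    """
    return {node.op_type for node in model.graph.node}


def derive_block_list(model, allow_list: Iterable[str]) -> List[str]:
    """Build the "reverse list": every op type in the model not in allow_list."""
    return sorted(collect_op_types(model) - set(allow_list))


def resolve_block_list(
    model,
    explicit_block_list: Optional[Iterable[str]] = None,
    allow_list: Optional[Iterable[str]] = None,
) -> List[str]:
    """Pick the block list to use for either the FP16 or the QDQ path."""
    # An explicit user setting always takes priority, even if it is empty.
    if explicit_block_list is not None:
        return list(explicit_block_list)
    # Otherwise derive the block list from the model's own op types.
    if allow_list is not None:
        return derive_block_list(model, allow_list)
    # Nothing configured: block nothing, which matches the current behavior.
    return []


# Stand-in for a loaded ONNX model (hypothetical toy graph).
toy_model = SimpleNamespace(
    graph=SimpleNamespace(
        node=[
            SimpleNamespace(op_type="Conv"),
            SimpleNamespace(op_type="Relu"),
            SimpleNamespace(op_type="Softmax"),
            SimpleNamespace(op_type="Conv"),
        ]
    )
)

print(resolve_block_list(toy_model, allow_list=["Conv", "Relu"]))
# -> ['Softmax']
print(resolve_block_list(toy_model, explicit_block_list=["Relu"], allow_list=["Conv"]))
# -> ['Relu']
```

Checking explicit_block_list against None rather than truthiness means a user who passes an empty list still overrides the auto-derived one, which keeps existing configs behaving as they do now.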
Related
- src/winml/modelkit/quant/config.py (WinMLQuantizationConfig)