System Info
transformers 5.0.0rc1, torch 2.9.1
Who can help?
No response
Information
Tasks
Reproduction
import torch
import transformers

model = transformers.AutoModelForCausalLM.from_pretrained("PleIAs/Monad", attn_implementation="sdpa", dtype=torch.bfloat16)
model = torch.compile(model, mode="reduce-overhead")  # the mode string is hyphenated
model(...)

Running this prints the following warning:

/.../.venv/lib/python3.12/site-packages/torch/_inductor/compile_fx.py:312: UserWarning: TensorFloat32 tensor cores for float32 matrix multiplication available but not enabled. Consider setting `torch.set_float32_matmul_precision('high')` for better performance.
  warnings.warn(
Is this warning because the model still does some fp32 matmul?
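One way to narrow this down is to check whether any parameters or buffers are still float32 after loading. The following is only a rough sketch: `dtype_summary` and `non_matching` are hypothetical helper names, not part of transformers or torch, and they assume only that the object exposes `named_parameters()` and `named_buffers()` the way a `torch.nn.Module` does. Finding float32 tensors would not by itself prove they cause the warning. It only shows whether the model is fully bfloat16.

```python
from collections import Counter


def dtype_summary(module):
    """Count the parameters and buffers of `module` by dtype (as strings)."""
    counts = Counter()
    for _, tensor in module.named_parameters():
        counts[str(tensor.dtype)] += 1
    for _, tensor in module.named_buffers():
        counts[str(tensor.dtype)] += 1
    return counts


def non_matching(module, expected="torch.bfloat16"):
    """Return names of parameters/buffers whose dtype differs from `expected`."""
    named = list(module.named_parameters()) + list(module.named_buffers())
    return [name for name, tensor in named if str(tensor.dtype) != expected]
```

Calling `print(dtype_summary(model))` and `print(non_matching(model))` on the loaded model, before `torch.compile`, would list any tensors that are not bfloat16.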
Expected behavior
No warning should be emitted when the model is loaded in bfloat16.