[CodeGen][CUDA] Move fast math intrinsic lowering option to PassContext #19596

Open
tlopex wants to merge 2 commits into apache:main from tlopex:pass1
tlopex (Member) commented May 24, 2026

This updates CUDA fast math intrinsic lowering to use a PassContext option instead of a CUDA Target attribute.

The new option is:

import tvm

# Fast math lowering applies to whatever is built inside this context.
with tvm.transform.PassContext(config={"tirx.enable_fast_math": True}):
    ...

When unset or false, CUDA math intrinsics continue to lower to the precise CUDA math functions such as expf. When true, tirx.LowerIntrin prioritizes the cuda.fastmath.* lowering rules, producing fast math intrinsics such as __expf.
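
The selection behavior described above can be sketched as a toy model in plain Python. This is not TVM code and does not reflect how tirx.LowerIntrin is implemented: the rule tables, the `lower_intrinsic` helper, and the choice of ops are all hypothetical, picked only to illustrate "prefer the fast rule when the option is true, otherwise use the precise one".

```python
# Toy model of the lookup order described in this PR; NOT TVM's implementation.
# Both tables and the helper name are illustrative assumptions.
PRECISE_RULES = {"exp": "expf", "log": "logf", "erf": "erff"}

# "erf" is deliberately left out of the toy fast table to show the fallback.
FAST_RULES = {"exp": "__expf", "log": "__logf"}


def lower_intrinsic(op, config=None):
    """Return the CUDA function name this toy model would pick for `op`."""
    config = config or {}
    # An unset option behaves the same as False.
    enable_fast_math = bool(config.get("tirx.enable_fast_math", False))
    if enable_fast_math and op in FAST_RULES:
        return FAST_RULES[op]
    # Fall back to the precise rule when fast math is off or has no entry.
    return PRECISE_RULES[op]


default_name = lower_intrinsic("exp")
fast_name = lower_intrinsic("exp", {"tirx.enable_fast_math": True})
fallback_name = lower_intrinsic("erf", {"tirx.enable_fast_math": True})
```

In this sketch an op without a fast rule still lowers to its precise function when the option is on. Whether the real pass falls back the same way is an assumption here, not something this PR description states.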

gemini-code-assist (Bot, Contributor) left a comment

Code Review

This pull request refactors the enable_fast_math setting by moving it from a CUDA target attribute to a global pass configuration option (tirx.enable_fast_math). The changes include removing the attribute from target detection, registration, and the CUDA target kind definition, while updating the LowerIntrin pass and associated tests to utilize the PassContext for this configuration. I have no feedback to provide as there were no review comments to evaluate.
