[cuda backend] skip fully-masked KV blocks calculation in SDPA #20198

Open
Gasoonjia wants to merge 1 commit into g4-opt-int4-vecload from g4-opt-prefill-window-sdpa

[cuda][prefill] window-aware SDPA: skip fully-masked KV blocks (idea #1)

ac2968b