fix: prevent F.linear from saving dequantized weights

3670b8f
Open

fix: prevent F.linear from saving dequantized weights in MatMul4Bit/MatMul8bitLt to save ~13GB VRAM and prevent OOM errors #1935
