
fix(mm): support diffusers FLUX LoRAs on NF4/8-bit quantized base models#9118

Open
Pfannkuchensack wants to merge 3 commits into invoke-ai:main from Pfannkuchensack:fix/flux-nf4-merged-lora-attribute-error

Conversation

@Pfannkuchensack
Collaborator

Summary

CustomInvokeLinearNF4 and CustomInvokeLinear8bitLt were missing the _cast_weight_bias_for_input / _cast_tensor_for_input methods that the sidecar-patches branch in autocast_linear_forward_sidecar_patches calls. This caused an AttributeError whenever a patch other than LoRALayer/FluxControlLoRALayer (e.g. the MergedLayerPatch produced by the diffusers FLUX LoRA converter when fusing Q/K/V/MLP into linear1) was applied to a quantized FLUX module.
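For illustration only, a minimal sketch of the kind of casting helper the sidecar-patch path expects (this is not the PR's actual diff; the class below is a hypothetical stand-in for the two quantized wrappers, and only the device/dtype-cast idea is shown):

```python
import torch


class QuantizedLinearSketch(torch.nn.Module):
    """Hypothetical stand-in for CustomInvokeLinearNF4 / CustomInvokeLinear8bitLt."""

    def _cast_tensor_for_input(self, t: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
        # Move/cast an auxiliary tensor (e.g. a patch's delta or bias) so it is
        # compatible with activations computed from the input `x`.
        return t.to(device=x.device, dtype=x.dtype)

    def _cast_weight_bias_for_input(self, weight, bias, x):
        # Same cast applied to the layer's weight and optional bias.
        weight = self._cast_tensor_for_input(weight, x)
        bias = self._cast_tensor_for_input(bias, x) if bias is not None else None
        return weight, bias
```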

The weight is exposed as a meta-device tensor with the correct logical shape (for Params4bit the shape is read from quant_state, since .shape on the parameter reports the packed-byte layout). Shape-only patches (LoRA, LoHA, MergedLayerPatch) work; SetParameterLayer / DoRA on quantized modules remain unsupported.
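A rough sketch of the shape-recovery logic described above (hypothetical helper name; it assumes the logical shape of a bitsandbytes Params4bit lives on param.quant_state.shape, and the dtype of the meta view is arbitrary since only the shape matters):

```python
import torch


def logical_weight_view(param: torch.nn.Parameter) -> torch.Tensor:
    # For bitsandbytes Params4bit, `param.shape` describes the packed byte
    # buffer, so take the logical (out_features, in_features) shape from the
    # quant_state when one is present.
    quant_state = getattr(param, "quant_state", None)
    shape = quant_state.shape if quant_state is not None else param.shape
    # Shape-only view on the meta device; no data is allocated.
    return torch.empty(shape, device="meta", dtype=torch.float16)
```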

Related Issues / Discussions

https://discord.com/channels/1020123559063990373/1500616847106506752

QA Instructions

Download the LoRA from here and try to run it with a FLUX dev model.

Merge Plan

Standard merge.

Checklist

  • The PR has a short but descriptive title, suitable for a changelog
  • Tests added / updated (if applicable)
  • ❗Changes to a redux slice have a corresponding migration
  • Documentation added / updated (if applicable)
  • Updated What's New copy (if doing a release after this PR)

github-actions bot added the python (PRs that change python files) and backend (PRs that change backend files) labels on May 3, 2026.
