-
I think pruning layers from Chroma might work better than pruning the modulation out of a Flux Lite model, because Chroma's training took a lot longer than Flux.1 Lite's. It might also be worth trying what you did on Flex.1 alpha instead of Flux.1 Lite: both are 8B, but I'm pretty sure Flex.1 is based on Flux Schnell.
-
So I tried transplanting the distilled guidance from Chroma (I went for the first available model, Chroma-v2.5) into Freepik's flux.1-lite-8B. I also hacked some changes into sd.cpp to make it work somewhat, but it only works a little.
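For reference, the transplant can be done as plain state-dict surgery: copy every tensor under the donor's distilled-guidance prefix into the target, and drop the target's per-block modulation tensors that the distilled guidance replaces. A minimal sketch, assuming the Chroma-style key prefix `distilled_guidance_layer.` and a `_mod.` substring in Flux modulation keys (both key patterns are assumptions, not verified against the actual checkpoints):

```python
def transplant_guidance(donor: dict, target: dict,
                        prefix: str = "distilled_guidance_layer.") -> dict:
    """Merge the donor's distilled-guidance tensors into the target state
    dict, dropping the target's modulation tensors they are meant to replace.
    Key patterns are assumptions about the checkpoint layout."""
    # drop the target's modulation tensors (key pattern is an assumption)
    merged = {k: v for k, v in target.items() if "_mod." not in k}
    # copy over everything under the donor's distilled-guidance prefix
    merged.update({k: v for k, v in donor.items() if k.startswith(prefix)})
    return merged
```

With real checkpoints you would load both files with `safetensors.torch.load_file`, merge, and save the result; the same merge logic applies unchanged.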
Reasons I can think of that could cause this:
- splicing details
- it also needs sd.cpp changes to compensate for it not removing the middle blocks' modulation guidance
What really should be done is what lodestone did for Chroma, but I think that code was never published ( https://huggingface.co/lodestones/Chroma/discussions/12 ).
Or the reverse could be done, where we prune some middle layers from Chroma: flux.1-lite-8B removed double blocks 5-15 and retrained from block 4 (or all later blocks?). That would also be nicer license- and feature-wise.
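Pruning the middle double blocks is mostly a matter of dropping the keys for those indices and renumbering the remaining blocks so loaders like sd.cpp see a contiguous range. A minimal sketch, assuming a Flux-style key layout `double_blocks.<i>.<rest>` (an assumption about the checkpoint naming):

```python
def prune_double_blocks(state: dict, drop=range(5, 16)) -> dict:
    """Remove the double blocks with the given indices and renumber the
    survivors contiguously. Assumes Flux-style `double_blocks.<i>.<rest>`
    key names."""
    drop = set(drop)
    # collect the old block indices present in the checkpoint
    old_ids = sorted({int(k.split(".")[1]) for k in state
                      if k.startswith("double_blocks.")})
    # map each surviving old index to a new contiguous index
    remap = {old: new for new, old in
             enumerate(i for i in old_ids if i not in drop)}
    out = {}
    for k, v in state.items():
        if k.startswith("double_blocks."):
            head, idx, rest = k.split(".", 2)
            i = int(idx)
            if i in drop:
                continue  # pruned block, skip its tensors entirely
            k = f"{head}.{remap[i]}.{rest}"
        out[k] = v
    return out
```

The renumbering step matters: without it the loader would see a gap at index 5 and either fail or silently load the wrong layout. After pruning, some retraining (as Freepik did for flux.1-lite-8B) would presumably still be needed to recover quality.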