[Pipelines] Add DreamLite text-to-image and image-edit pipelines by Carlofkl · Pull Request #13815 · huggingface/diffusers

Carlofkl · 2026-05-27T06:16:54Z

Context

This PR integrates DreamLite — ByteDance's text-to-image / image-edit diffusion model — into diffusers, following an invitation from @NielsRogge to release the model on the Hub in diffusers format.

Related issue: ByteVisionLab/DreamLite#3 (comment)

Model cards (public, ungated):

Base (3-branch dual CFG): https://huggingface.co/carlofkl/DreamLite-base
Mobile (distilled, single forward): https://huggingface.co/carlofkl/DreamLite-mobile

Both repos use a diffusers branch (loaded via revision="diffusers") to keep the original ByteDance-internal main branch intact for backward compatibility with existing users.

What's added

src/diffusers/
├── models/unets/unet_dreamlite.py            # DreamLiteUNetModel
├── pipelines/dreamlite/
│   ├── __init__.py
│   ├── pipeline_dreamlite.py                  # DreamLitePipeline (3-branch dual CFG)
│   ├── pipeline_dreamlite_mobile.py           # DreamLiteMobilePipeline (distilled)
│   └── pipeline_output.py
└── (registered in src/diffusers/__init__.py, models/__init__.py,
    pipelines/__init__.py, utils/dummy_*.py)

docs/source/en/api/pipelines/dreamlite.md
tests/pipelines/dreamlite/
├── test_pipeline_dreamlite.py
└── test_pipeline_dreamlite_mobile.py

Architecture highlights

* DreamLiteUNetModel - UNet-based denoiser conditioned on Qwen3-VL text/vision embeddings.
* DreamLitePipeline - runs 3 forward passes per step (text-cond / image-cond / uncond) and combines them with a dual-CFG schedule for high-fidelity text-to-image generation and image editing.
* DreamLiteMobilePipeline - distilled single-pass variant with no CFG, designed for on-device inference. Pairs with AutoencoderTiny.
* Both pipelines use FlowMatchEulerDiscreteScheduler.
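
The exact dual-CFG schedule is not spelled out in this description. As a rough illustration only, the three per-step predictions could be merged with an InstructPix2Pix-style two-scale formula. The function name, the two scale parameters, and the formula itself are assumptions for the sketch, not the code in pipeline_dreamlite.py.

```python
def combine_dual_cfg(uncond, image_cond, text_cond, image_scale, text_scale):
    """Merge three branch predictions with two guidance scales.

    Assumed formula (InstructPix2Pix-style), not taken from the DreamLite source:
        out = uncond + image_scale * (image_cond - uncond)
                     + text_scale * (text_cond - image_cond)
    The arithmetic works on floats, NumPy arrays, or torch tensors alike.
    """
    image_direction = image_cond - uncond
    text_direction = text_cond - image_cond
    return uncond + image_scale * image_direction + text_scale * text_direction


# With both scales at 1.0 the result collapses to the text-conditioned branch.
print(combine_dual_cfg(0.0, 1.0, 3.0, image_scale=1.0, text_scale=1.0))
```

Under this assumed formula, raising image_scale pushes the sample toward the reference image, while raising text_scale pushes it toward the prompt relative to the image-conditioned prediction.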

Testing

Loading smoke test against carlofkl/DreamLite-base with revision="diffusers" — all 6 sub-modules resolve to the correct diffusers.* namespace.
Inference smoke test — generates a 1024×1024 image in ~0.6s/step on a single A800; output stats sane (std≈93, no NaN/Inf).
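
For reference, the kind of output sanity check mentioned above (standard deviation plus a NaN/Inf scan) can be sketched without any framework. The helper name and the flat-list input are assumptions for illustration; the actual smoke test script is not part of this description.

```python
import math
import statistics


def output_stats(pixels):
    """Return (std, all_finite) for a flat iterable of pixel values."""
    values = [float(p) for p in pixels]
    all_finite = all(math.isfinite(v) for v in values)
    # Only compute the spread when every value is finite; NaN/Inf would poison it.
    std = statistics.pstdev(values) if all_finite else float("nan")
    return std, all_finite


std, finite = output_stats([0, 255, 0, 255])
print(std, finite)
```

A near-zero standard deviation would suggest a flat (for example all-black) image, which is why the spread is reported alongside the finiteness check.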
Standard pipeline tests in tests/pipelines/dreamlite/.

Before submitting

Did you read the contributor guideline?
Did you read our philosophy doc?
Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case. → Release DreamLite on Hugging Face ByteVisionLab/DreamLite#3
Did you make sure to update the documentation with your changes? Here are the documentation guidelines.
Did you write any new necessary tests?

Who can review?

cc @sayakpaul @yiyixuxu @DN6 — thanks in advance for the review!

Add ByteDance's DreamLite model family to diffusers. DreamLite is a UNet-based diffusion model that supports both text-to-image generation and reference-image editing through a shared 3-branch dual-CFG design.

Two pipelines are shipped:

* DreamLitePipeline - full 3-branch dual CFG (negative, reference, prompt); supports T2I and I2I editing at 1024x1024.
* DreamLiteMobilePipeline - distilled single-branch variant for on-device inference; no CFG.

New model code (all isolated under *_dreamlite.py / unet_dreamlite.py to avoid touching shared upstream files):

* models/transformers/transformer_2d_dreamlite.py - DreamLite 2D transformer block.
* models/unets/unet_dreamlite.py - DreamLiteUNetModel.
* models/unets/unet_2d_blocks_dreamlite.py - DreamLite-specific down/up/mid blocks.
* models/resnet_dreamlite.py - DreamLite ResNet variants.
* models/attention_processor.py - add DreamLiteAttnProcessor2_0 (pure addition, no existing processor modified).

Pipeline + tests + docs:

* pipelines/dreamlite/{__init__.py, pipeline_dreamlite.py, pipeline_dreamlite_mobile.py, pipeline_output.py}.
* tests/pipelines/dreamlite/{test_pipeline_dreamlite.py, test_pipeline_dreamlite_mobile.py} with the standard PipelineTesterMixin suite; setUp/tearDown auto-patches encode_prompt with a fake so MagicMock text encoders work without per-test boilerplate.
* Skip 8 mixin tests that don't apply to DreamLite (MagicMock serialisation, custom attention processor, encode_prompt return shape, batch_size > 1 sweep), mirroring SD3 / Flux conventions.
* docs/source/en/api/pipelines/dreamlite.md + _toctree.yml entry (alphabetically between DiT and EasyAnimate).
* Register exports in 6 __init__.py files.

Two real bugs surfaced by the mixin test suite are fixed in this commit:

* num_images_per_prompt > 1: prompt_embeds and text_attention_mask are now repeated along the batch dimension in both pipelines' T2I and I2I branches before being passed to the UNet.
* vae=None: __init__ now guards the encoder_block_out_channels lookup so encode_prompt can be tested in isolation per PipelineTesterMixin convention.

SlowTests real-checkpoint resolution is set to 1024x1024 (the only size DreamLite is trained for).

Test result: 27 passed, 50 skipped, 0 failed on CPU fast suite. make style && make quality: clean.
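
To illustrate the num_images_per_prompt fix described in this commit message, here is a framework-free sketch that uses nested lists in place of tensors. The helper name is hypothetical, and whether the pipelines use `repeat`, `repeat_interleave`, or another tensor op is not stated here; the sketch only shows the intended effect on the batch dimension.

```python
def repeat_per_prompt(rows, num_images_per_prompt):
    """Repeat each batch row num_images_per_prompt times, keeping copies adjacent."""
    if num_images_per_prompt < 1:
        raise ValueError("num_images_per_prompt must be >= 1")
    return [row for row in rows for _ in range(num_images_per_prompt)]


# Two prompts, two images each -> effective batch of four.
prompt_embeds = [["p0_tok0", "p0_tok1"], ["p1_tok0", "p1_tok1"]]
text_attention_mask = [[1, 1], [1, 0]]

embeds = repeat_per_prompt(prompt_embeds, 2)
mask = repeat_per_prompt(text_attention_mask, 2)
print(len(embeds), len(mask))
```

The point of the fix is that the embeddings and the attention mask must be expanded together; if only one of them is repeated, their batch sizes no longer agree when they reach the UNet.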

The `carlofkl/DreamLite-{base,mobile}` Hub repos host two flavours of the same checkpoint:

* `main` branch - keeps `model_index.json` pointing at ByteDance's internal package path so the original (non-diffusers) reference code can still load these weights.
* `diffusers` branch - rewrites the `unet` entry of `model_index.json` to `["diffusers", "DreamLiteUNetModel"]` so this integration loads correctly from `diffusers`.

This commit pins every `from_pretrained(...)` call shipped with the diffusers integration (docs examples, pipeline docstrings, SlowTests) to `revision="diffusers"`. Local-override env vars (DREAMLITE_BASE_PATH / DREAMLITE_MOBILE_PATH) still bypass the revision pin.
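
The interaction between the revision pin and the local-override env vars could look roughly like the sketch below. The helper `resolve_checkpoint` and its `environ` parameter are hypothetical and exist only for illustration; the repo ID, the env var names, and `revision="diffusers"` come from this commit message.

```python
import os


def resolve_checkpoint(repo_id, env_var, environ=None):
    """Return (source, from_pretrained_kwargs) for a DreamLite checkpoint.

    Hypothetical helper: a non-empty env var wins and bypasses the revision pin.
    """
    env = os.environ if environ is None else environ
    local_path = env.get(env_var)
    if local_path:
        # Local override: load from disk, so no Hub revision applies.
        return local_path, {}
    return repo_id, {"revision": "diffusers"}


source, kwargs = resolve_checkpoint(
    "carlofkl/DreamLite-base", "DREAMLITE_BASE_PATH", environ={}
)
print(source, kwargs)
```

The returned pair would then be passed on as `from_pretrained(source, **kwargs)`, so the Hub path always carries the pin and the local path never does.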

…ts after rebase

Mechanical changes after rebasing onto current `main`:

* `pipeline_dreamlite.py::retrieve_timesteps` - re-synced from `diffusers.pipelines.flux.pipeline_flux.retrieve_timesteps` (PEP 604 type hints, expanded docstring, plus the new `accepts_timesteps` / `accept_sigmas` introspection guards). DreamLite's default code path uses `num_inference_steps` (uniform schedule) and never passes custom `timesteps` / `sigmas`, so the added guards are dead-code for this pipeline - behaviour is unchanged.
* `dummy_pt_objects.py` / `dummy_torch_and_transformers_objects.py` - registered the dummy classes auto-generated by `make fix-copies` for `DreamLiteTransformer2DModel`, `DreamLiteUNetModel`, `DreamLitePipeline`, `DreamLiteMobilePipeline`, `DreamLitePipelineOutput`. Generated by `make fix-copies`. No hand edits.
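
For readers unfamiliar with the introspection guards mentioned above, the idea is to check whether a scheduler's `set_timesteps` accepts a given keyword before passing it. The sketch below uses two hypothetical stand-in scheduler classes and a hypothetical helper name; it is not the `retrieve_timesteps` source.

```python
import inspect


class UniformOnlyScheduler:
    """Hypothetical scheduler whose set_timesteps takes only a step count."""

    def set_timesteps(self, num_inference_steps, device=None):
        self.num_inference_steps = num_inference_steps


class CustomScheduleScheduler:
    """Hypothetical scheduler that also accepts custom timesteps and sigmas."""

    def set_timesteps(self, num_inference_steps=None, device=None, timesteps=None, sigmas=None):
        self.num_inference_steps = num_inference_steps


def accepts_kwarg(scheduler, name):
    """True if scheduler.set_timesteps declares a parameter called `name`."""
    return name in inspect.signature(scheduler.set_timesteps).parameters


print(accepts_kwarg(UniformOnlyScheduler(), "sigmas"))
print(accepts_kwarg(CustomScheduleScheduler(), "sigmas"))
```

A pipeline that only ever passes `num_inference_steps`, as DreamLite's default path does, never reaches a branch that depends on this check.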

Carlofkl added 3 commits May 27, 2026 11:38

github-actions Bot added the size/L, documentation, models, tests, utils, and pipelines labels and removed the size/L label May 27, 2026

Carlofkl mentioned this pull request May 27, 2026

Release DreamLite on Hugging Face ByteVisionLab/DreamLite#3

Open

Carlofkl wants to merge 3 commits into huggingface:main from Carlofkl:feature/dreamlite-integration
