Use device_map="auto" in single file tests to support large models on limited GPU memory #13816

Open
jiqing-feng wants to merge 2 commits into huggingface:main from jiqing-feng:flux

Conversation

@jiqing-feng
Contributor

Problem

Single-file loading tests (SingleFileTesterMixin) used device=torch_device or device_map=torch_device, which forces the entire model onto a single GPU. For large models such as FLUX.1-dev (~12B params, ~24GB in bf16), this fails on a single 24GB GPU, especially in test_single_file_model_config, which loads two models simultaneously.
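The arithmetic behind these figures can be sketched as follows. This is a rough estimate of weight storage only, assuming the approximate 12B parameter count quoted above and a nominal 24GB card in decimal units; it ignores activations, buffers, and allocator overhead, and the helper name `weight_bytes` is made up for illustration.

```python
# Rough weight-memory estimate (weights only, no activations or overhead).
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2}


def weight_bytes(num_params: int, dtype: str) -> int:
    """Bytes needed to hold the weights alone in the given dtype."""
    return num_params * BYTES_PER_PARAM[dtype]


NUM_PARAMS = 12_000_000_000  # approximate figure quoted in the description
GPU_BYTES = 24 * 10**9       # nominal 24GB card, decimal units (assumption)

one_model = weight_bytes(NUM_PARAMS, "bfloat16")
two_models = 2 * one_model   # test_single_file_model_config holds two models

print(f"one bf16 copy:   {one_model / 1e9:.0f} GB")
print(f"two bf16 copies: {two_models / 1e9:.0f} GB")
print(f"two copies fit on one GPU: {two_models <= GPU_BYTES}")
```

Even before counting any runtime overhead, two bf16 copies need roughly twice the card's capacity, which is why the config test is the first to fail.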

Changes

tests/models/testing_utils/single_file.py

  • test_single_file_model_config: device=torch_device → device_map="auto"
  • test_single_file_model_parameters: device_map=str(torch_device) / device=torch_device → device_map="auto"
  • test_single_file_loading_with_device_map: device_map=torch_device → device_map="auto"
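The substitution listed above can be illustrated with a small sketch. It is not the PR's diff: `load_kwargs`, `load_single_file`, and `FakeModel` are hypothetical names, and `FakeModel` only stands in for a model class exposing `from_single_file` so the example runs without a GPU or a checkpoint.

```python
from typing import Any, Dict


def load_kwargs(use_auto: bool, torch_device: str = "cuda") -> Dict[str, Any]:
    """Placement kwargs before (use_auto=False) and after (use_auto=True) the change."""
    if use_auto:
        return {"device_map": "auto"}
    return {"device": torch_device}


def load_single_file(model_class: Any, ckpt_path: str, **extra: Any) -> Any:
    """Load via from_single_file with the new placement kwargs (hypothetical helper)."""
    return model_class.from_single_file(ckpt_path, **load_kwargs(use_auto=True), **extra)


class FakeModel:
    """Stand-in for a real model class; it just echoes what it was called with."""

    @classmethod
    def from_single_file(cls, path: str, **kwargs: Any) -> Any:
        return path, kwargs


result = load_single_file(FakeModel, "model.safetensors")
print(result)
```

With a real model class, the same call shape would pass `device_map="auto"` through to the loader instead of pinning everything to `torch_device`.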

tests/models/transformers/test_models_transformer_flux.py

  • TestFluxSingleFile: added torch_dtype = torch.bfloat16 to halve memory usage relative to float32

tests/single_file/test_model_flux_transformer_single_file.py

  • test_device_map_cuda → test_device_map_auto: device_map="cuda" → device_map="auto", added torch_dtype=torch.bfloat16

Why device_map="auto" instead of CPU offload

enable_model_cpu_offload() is a pipeline-level API and is not available when loading an individual model with from_single_file. device_map="auto" is the model-level solution: accelerate places as many weights as fit on the GPU and offloads the rest to CPU RAM when GPU memory is insufficient.
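The fill-the-GPU-then-spill behaviour described above can be pictured with a toy model. This is only an illustration under simplified assumptions, not accelerate's actual placement algorithm; `toy_device_map`, the block names, and the sizes are all made up.

```python
from typing import Dict, List, Tuple


def toy_device_map(blocks: List[Tuple[str, int]], gpu_budget: int) -> Dict[str, str]:
    """Assign blocks to 'gpu' in order until the budget runs out, then to 'cpu'."""
    placement: Dict[str, str] = {}
    used = 0
    spilled = False
    for name, size in blocks:
        if not spilled and used + size <= gpu_budget:
            placement[name] = "gpu"
            used += size
        else:
            # Once one block does not fit, keep the remaining blocks on CPU.
            spilled = True
            placement[name] = "cpu"
    return placement


# Three blocks of 10 units each against a budget of 24 units.
blocks = [("block_0", 10), ("block_1", 10), ("block_2", 10)]
placement = toy_device_map(blocks, gpu_budget=24)
print(placement)
```

In this toy run the first two blocks land on the GPU and the third is kept on CPU, which mirrors why a model slightly larger than the card can still be loaded.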

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
@github-actions github-actions Bot added size/M PR with diff < 200 LOC tests and removed size/M PR with diff < 200 LOC labels May 27, 2026
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
@github-actions github-actions Bot added the size/S PR with diff < 50 LOC label May 27, 2026
@jiqing-feng
Contributor Author

Hi @sayakpaul. Would you please review this PR? Thanks!

@sayakpaul sayakpaul requested a review from DN6 May 27, 2026 09:20

Labels

size/S PR with diff < 50 LOC tests
