Fix path reconstruction in ModelZoo download #44

Open
C-Achard wants to merge 5 commits into DeepLabCut:main from C-Achard:cy/fix-modelzoo-download

C-Achard commented Mar 24, 2026

Scope

This PR aims to fix DeepLabCut/DeepLabCut#3256, where the download code pointed at a nonexistent cache folder because it reconstructed the cache path incorrectly.

Underlying issue

Previously, after calling hf_hub_download(), the code rebuilt the expected cache location under models--.../snapshots/<commit>/... instead of using the path returned by the download function itself. Since hf_hub_download() already returns the resolved local file path, reconstructing the cache layout was unnecessary and could point to a non-existent location.

Fix

Instead, this patch takes the path returned by hf_hub_download() and copies that file to the target directory, so no path reconstruction is needed.

Also adds tests for the newly modified code that cover behavior more extensively.
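
The idea behind the fix can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this PR: `copy_download_to_target` is a hypothetical helper, and `download` is a stand-in for a zero-argument wrapper around hf_hub_download(), which returns the local path of the downloaded file.

```python
import shutil
from pathlib import Path
from typing import Callable


def copy_download_to_target(download: Callable[[], str], target_dir: str) -> Path:
    """Copy the file that `download` reports into `target_dir`.

    `download` is a hypothetical stand-in for a wrapper around
    hf_hub_download(); it must return the local path of the downloaded file.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)  # make sure the destination exists

    # Trust the path the downloader returns instead of rebuilding
    # models--.../snapshots/<commit>/... by hand.
    downloaded = Path(download())

    destination = target / downloaded.name
    # copy2 follows symlinks by default, so a symlinked cache entry ends up
    # as a regular file, and copying also works across filesystems.
    shutil.copy2(downloaded, destination)
    return destination
```

Because the helper never inspects the cache layout, it keeps working if the layout under the cache directory changes.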

TODO

  • Tests run locally
  • Download successful

Replace os.rename with shutil.copy2 when handling downloaded files to avoid cross-filesystem rename errors and to preserve metadata. Ensure target_dir exists, use the path returned by hf_hub_download instead of assuming cache folder layout, normalize rename_mapping into a mapping, and call _handle_downloaded_file with the actual downloaded file path. Track and remove the HF cache folder safely (ignore_errors) only if present. These changes make HuggingFace model downloads more reliable across environments.

C-Achard changed the title from "Fix path issue in ModelZoo download" to "Fix path reconstruction in ModelZoo download" on Mar 24, 2026

Use a temporary directory for HuggingFace cache (tempfile.TemporaryDirectory) so HF cache folders are not created under the target_dir. Improve tar.gz handling in _handle_downloaded_file by extracting members with tar.extractfile and shutil.copyfileobj (and fall back to shutil.copy2 for non-tar files), and remove the previous symlink-resolution and explicit HF-folder removal logic. Update hf_hub_download call to use repo_id/filename named args and thread per-file rename mappings through to the extractor. Adjusted tests to reflect that the HF cache is no longer created inside the target directory and to ensure the final artifact still exists.
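
The temporary-cache approach described in this commit might look roughly like the sketch below. It is an illustration only: `download_with_temp_cache` is a hypothetical name, and `download` is assumed to be a callable that takes a cache directory and returns the path of the downloaded file, as hf_hub_download() does when given a `cache_dir` argument.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable


def download_with_temp_cache(download: Callable[[str], str], target_dir: str) -> Path:
    """Download into a throwaway cache and keep only the final artifact.

    `download` is a hypothetical stand-in for a wrapper around
    hf_hub_download() that forwards the cache directory it receives.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)

    # The cache lives outside target_dir and is deleted when the block exits.
    with tempfile.TemporaryDirectory() as cache_dir:
        downloaded = Path(download(cache_dir))
        destination = target / downloaded.name
        # Copy before the cache disappears; a rename could fail if the
        # temporary directory sits on a different filesystem.
        shutil.copy2(downloaded, destination)

    return destination
```

With huggingface_hub installed, `download` could for instance be `lambda cache_dir: hf_hub_download(repo_id=repo_id, filename=filename, cache_dir=cache_dir)`, where `repo_id` and `filename` are supplied by the caller.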

Delete the remove_hf_folder parameter from download_huggingface_model (and its docstring) and remove the related unit test. Also modify the tar extraction logic in _handle_downloaded_file by switching the TarInfo check from member.isdir() to member.isfile(), so that extraction selects members by whether they are regular files rather than by whether they are directories.

Open archives with a permissive compression mode (r:*) and only extract regular files. Skip members with empty basenames or where extractfile() returns None, use a context manager for file streams, and raise a ReadError if no regular files were extracted to fail loudly. Preserve previous fallback: if the file isn't an archive, copy it as a direct model file. Also update example usage to remove the explicit remove_hf_folder argument.
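
The extraction rules listed in this commit can be sketched as follows. This is an approximation, not the PR's `_handle_downloaded_file`: the function name `extract_model_files` is hypothetical, rename mappings are left out, and using `tarfile.is_tarfile()` to choose between extraction and the plain-copy fallback is an assumption of this sketch.

```python
import shutil
import tarfile
from pathlib import Path


def extract_model_files(archive_path, target_dir):
    """Extract regular files from an archive, or copy a plain model file."""
    archive_path = Path(archive_path)
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)

    # Fallback: not an archive, so treat it as a direct model file.
    if not tarfile.is_tarfile(archive_path):
        destination = target / archive_path.name
        shutil.copy2(archive_path, destination)
        return [destination]

    extracted = []
    # "r:*" lets tarfile detect the compression (none, gz, bz2, xz).
    with tarfile.open(archive_path, mode="r:*") as tar:
        for member in tar:
            if not member.isfile():
                continue  # skip directories, links, devices
            name = Path(member.name).name
            if not name:
                continue  # skip members without a usable basename
            source = tar.extractfile(member)
            if source is None:
                continue  # nothing readable for this member
            destination = target / name
            # Stream the member to disk; both handles are closed on exit.
            with source, open(destination, "wb") as out:
                shutil.copyfileobj(source, out)
            extracted.append(destination)

    if not extracted:
        # Fail loudly rather than silently producing an empty directory.
        raise tarfile.ReadError(f"no regular files found in {archive_path}")
    return extracted
```

Writing each member under its basename flattens the archive and keeps every output inside the target directory, at the cost of overwriting files that share a name.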

C-Achard (Author) commented

Update: made a breaking change by removing the remove_hf_folder argument, which is currently unused in DLC and is no longer necessary because the cache directory is no longer the target directory.

Development

Successfully merging this pull request may close these issues.

superanimal_topviewmouse detector
