Skip to content

fix windows updater zip root path normalization#8019

Open
zouyonghe wants to merge 5 commits intoAstrBotDevs:masterfrom
zouyonghe:codex/fix-windows-updater-path-normalization
Open

fix windows updater zip root path normalization#8019
zouyonghe wants to merge 5 commits intoAstrBotDevs:masterfrom
zouyonghe:codex/fix-windows-updater-path-normalization

Conversation

@zouyonghe
Copy link
Copy Markdown
Member

@zouyonghe zouyonghe commented May 5, 2026

Summary

This fixes Windows updater failures caused by using the zip archive root entry directly as a filesystem path fragment during extraction.

On Windows, GitHub zipballs expose root entries like AstrBotDevs-AstrBot-39386ee/. When the updater joins that POSIX-style fragment onto an extended-length path such as \\?\C:\Users\..., downstream filesystem calls can receive mixed separators and fail with WinError 123. Users then cannot update either AstrBot Core or plugins from the dashboard.

The root cause was that both astrbot/core/zip_updator.py and astrbot/core/star/updator.py reused z.namelist()[0] directly for os.listdir(), shutil.move(), and cleanup paths. That made the updater depend on archive entry formatting instead of normalizing to the local platform path semantics first.

This change adds a small shared normalization step in RepoZipUpdator and reuses the normalized root path for all downstream filesystem operations in both update flows. The fix stays narrow and does not change download behavior or the overall extraction flow.

Testing

I added regression coverage for both updater paths and verified that the tests fail before the fix and pass after it.

Commands used:

  • uv run pytest tests/test_updator_socks.py -q
  • uv run ruff check astrbot/core/zip_updator.py astrbot/core/star/updator.py tests/test_updator_socks.py
  • uv run ruff format --check astrbot/core/zip_updator.py astrbot/core/star/updator.py tests/test_updator_socks.py

Notes

The new tests are unit-level and mocked, so the remaining residual risk is that they do not exercise a real Windows filesystem end to end. They do verify the specific behavior that was broken here: normalizing extended-length Windows paths before listdir, move, and cleanup reuse.

Summary by Sourcery

Normalize archive root directories when extracting GitHub zipballs to fix Windows updater path handling for core and plugin updates.

Bug Fixes:

  • Fix Windows updater failures caused by using raw zip archive root entries with mixed path separators during extraction for core updates.
  • Fix plugin updater extraction on Windows by reusing a normalized archive root path for filesystem operations.

Tests:

  • Add regression tests verifying that repo and plugin zip extraction normalize Windows extended-length paths before listdir, move, and cleanup.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request improves path normalization during the unzipping process in both PluginUpdator and RepoZipUpdator by utilizing os.path.normpath and a new _normalize_archive_root_dir method. These changes, supported by new test cases, specifically address path handling issues on Windows, including support for extended length paths. Review feedback suggests narrowing the exception handling from BaseException to Exception during the cleanup phase to avoid catching system-exiting signals like KeyboardInterrupt.

shutil.rmtree(os.path.join(target_dir, update_dir), onerror=on_error)
shutil.rmtree(update_root_path, onerror=on_error)
os.remove(zip_path)
except BaseException:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Catching BaseException is overly broad as it includes system-exiting exceptions like SystemExit and KeyboardInterrupt. It is recommended to catch Exception instead to handle standard runtime errors during cleanup.

Suggested change
except BaseException:
except Exception:

@zouyonghe zouyonghe mentioned this pull request May 5, 2026
2 tasks
…-updater-path-normalization

# Conflicts:
#	astrbot/core/star/updator.py
@zouyonghe zouyonghe changed the title [codex] fix windows updater zip root path normalization fix windows updater zip root path normalization May 5, 2026
@zouyonghe zouyonghe marked this pull request as ready for review May 5, 2026 15:31
@auto-assign auto-assign Bot requested a review from LIghtJUNction May 5, 2026 15:31
@dosubot dosubot Bot added the size:M This PR changes 30-99 lines, ignoring generated files. label May 5, 2026
@auto-assign auto-assign Bot requested a review from Raven95676 May 5, 2026 15:31
@dosubot dosubot Bot added area:core The bug / feature is about astrbot's core, backend feature:updater The bug / feature is about astrbot updater system labels May 5, 2026
Copy link
Copy Markdown
Contributor

@sourcery-ai sourcery-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've left some high level feedback:

  • Consider moving _normalize_archive_root_dir into a shared base class or utility used by both repo and plugin updaters so the normalization logic is defined in one place and you don’t rely on each class defining its own helper.
  • In both unzip_file implementations you repeatedly call os.path.normpath inside the loop; you could simplify and reduce redundancy by computing normalized Path objects once (e.g., via pathlib.Path) and reusing them for item paths.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- Consider moving `_normalize_archive_root_dir` into a shared base class or utility used by both repo and plugin updaters so the normalization logic is defined in one place and you don’t rely on each class defining its own helper.
- In both `unzip_file` implementations you repeatedly call `os.path.normpath` inside the loop; you could simplify and reduce redundancy by computing normalized `Path` objects once (e.g., via `pathlib.Path`) and reusing them for item paths.

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

@zouyonghe
Copy link
Copy Markdown
Member Author

@sourcery-ai review

Copy link
Copy Markdown
Contributor

@sourcery-ai sourcery-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 2 issues, and left some high level feedback:

  • In RepoZipUpdator._normalize_archive_root_dir, referencing RepoZipUpdator._normalize_archive_path directly makes it harder to override in subclasses; consider using a @classmethod with cls._normalize_archive_path or extracting these helpers to a shared module-level function for better extensibility.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- In `RepoZipUpdator._normalize_archive_root_dir`, referencing `RepoZipUpdator._normalize_archive_path` directly makes it harder to override in subclasses; consider using a `@classmethod` with `cls._normalize_archive_path` or extracting these helpers to a shared module-level function for better extensibility.

## Individual Comments

### Comment 1
<location path="tests/test_updator_socks.py" line_range="429-438" />
<code_context>
+    archive_root = "AstrBotDevs-AstrBot-39386ee/"
</code_context>
<issue_to_address>
**suggestion (testing):** Add coverage for archive roots without a trailing slash or with nested folder names to fully exercise normalization

These tests currently assume `namelist()[0]` is a single-level folder with a trailing slash (e.g., `"AstrBotDevs-AstrBot-39386ee/"`). Since `_normalize_archive_root_dir` is meant to harden path handling, please add coverage for: (1) `"AstrBotDevs-AstrBot-39386ee"` (no trailing slash), (2) nested roots like `"owner-repo-branch/subdir/"`, and (3) a case where the root normalizes to `"."` so the empty-string fallback is exercised. This will better lock in the normalization semantics and prevent regressions in zip root handling.

Suggested implementation:

```python
@pytest.mark.parametrize(
    "archive_root",
    [
        # Trailing slash single-level folder
        "AstrBotDevs-AstrBot-39386ee/",
        # Same folder without trailing slash
        "AstrBotDevs-AstrBot-39386ee",
        # Nested archive root
        "owner-repo-branch/subdir/",
        # Archive root that normalizes to "." so the empty-string fallback is exercised
        ".",
    ],
)
def test_repo_unzip_file_normalizes_windows_extended_length_paths(
    monkeypatch: pytest.MonkeyPatch,
    archive_root: str,
) -> None:

```

```python
    target_dir = r"\\?\\C:\\Users\\admin\\AppData\\Local\\AstrBot\\backend\\app"

```
</issue_to_address>

### Comment 2
<location path="astrbot/core/zip_updator.py" line_range="234" />
<code_context>
             return author, repo, branch
         raise ValueError("无效的 GitHub URL")

+    @staticmethod
+    def _normalize_archive_path(*parts: str) -> str:
+        return os.path.normpath(os.path.join(*parts))
</code_context>
<issue_to_address>
**issue (complexity):** Consider simplifying the new path-normalization logic by removing the extra helper and intermediate variables while preserving the updated handling of '.' and archive roots.

The new behavior around `"."` and path normalization is useful, but you can get it with less indirection by:

* Dropping `_normalize_archive_path`
* Keeping a single root-normalization helper
* Using direct `os.path.join` calls instead of multiple intermediate path variables

For example:

```python
@staticmethod
def _normalize_archive_root_dir(entry: str) -> str:
    """Normalize the first zip entry; treat '.' as empty root."""
    update_dir = os.path.normpath(entry)
    return "" if update_dir == "." else update_dir
```

Then `unzip_file` can stay close to the original structure while preserving your new behavior:

```python
def unzip_file(self, zip_path: str, target_dir: str) -> None:
    """解压缩文件, 并将压缩包内**第一个**文件夹内的文件移动到 target_dir"""
    ensure_dir(target_dir)
    update_dir = ""
    with zipfile.ZipFile(zip_path, "r") as z:
        update_dir = self._normalize_archive_root_dir(z.namelist()[0])
        z.extractall(target_dir)
    logger.debug(f"解压文件完成: {zip_path}")

    update_root = os.path.join(target_dir, update_dir)
    files = os.listdir(update_root)
    for f in files:
        update_item_path = os.path.join(update_root, f)
        target_item_path = os.path.join(target_dir, f)

        if os.path.isdir(update_item_path):
            if os.path.exists(target_item_path):
                shutil.rmtree(target_item_path, onerror=on_error)
        elif os.path.exists(target_item_path):
            os.remove(target_item_path)

        shutil.move(update_item_path, target_dir)

    try:
        logger.debug(f"删除临时更新文件: {zip_path}{update_root}")
        shutil.rmtree(update_root, onerror=on_error)
        os.remove(zip_path)
    except BaseException:
        logger.warning(
            f"删除更新文件失败,可以手动删除 {zip_path}{update_root}",
        )
```

This keeps the improved normalization and `"."` handling, but reduces the number of helper methods and intermediate variables, making `unzip_file` easier to scan and reason about.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment thread tests/test_updator_socks.py Outdated
Comment on lines +429 to +438
archive_root = "AstrBotDevs-AstrBot-39386ee/"
expected_root = ntpath.normpath(ntpath.join(target_dir, archive_root))
expected_file = ntpath.normpath(
ntpath.join(target_dir, archive_root, ".dockerignore")
)
captured: dict[str, object] = {}

def fake_listdir(path: str) -> list[str]:
captured.setdefault("listdir", path)
return [".dockerignore"]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): Add coverage for archive roots without a trailing slash or with nested folder names to fully exercise normalization

These tests currently assume namelist()[0] is a single-level folder with a trailing slash (e.g., "AstrBotDevs-AstrBot-39386ee/"). Since _normalize_archive_root_dir is meant to harden path handling, please add coverage for: (1) "AstrBotDevs-AstrBot-39386ee" (no trailing slash), (2) nested roots like "owner-repo-branch/subdir/", and (3) a case where the root normalizes to "." so the empty-string fallback is exercised. This will better lock in the normalization semantics and prevent regressions in zip root handling.

Suggested implementation:

@pytest.mark.parametrize(
    "archive_root",
    [
        # Trailing slash single-level folder
        "AstrBotDevs-AstrBot-39386ee/",
        # Same folder without trailing slash
        "AstrBotDevs-AstrBot-39386ee",
        # Nested archive root
        "owner-repo-branch/subdir/",
        # Archive root that normalizes to "." so the empty-string fallback is exercised
        ".",
    ],
)
def test_repo_unzip_file_normalizes_windows_extended_length_paths(
    monkeypatch: pytest.MonkeyPatch,
    archive_root: str,
) -> None:
    target_dir = r"\\?\\C:\\Users\\admin\\AppData\\Local\\AstrBot\\backend\\app"

Comment thread astrbot/core/zip_updator.py Outdated
@zouyonghe
Copy link
Copy Markdown
Member Author

@sourcery-ai review

Copy link
Copy Markdown
Contributor

@sourcery-ai sourcery-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 3 issues, and left some high level feedback:

  • The _normalize_archive_root_dir logic is only defined on RepoZipUpdator but is now also used in the plugin updater; consider moving this helper into a shared utility or explicit base class to make the dependency between the two updaters clearer and avoid relying on implicit inheritance/import structure.
  • In the unzip flows you now call os.path.normpath both inside _normalize_archive_root_dir and again when building update_root_path; if the second normalization is not strictly needed, removing it would simplify the path handling and reduce confusion about where normalization is expected to happen.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The `_normalize_archive_root_dir` logic is only defined on `RepoZipUpdator` but is now also used in the plugin updater; consider moving this helper into a shared utility or explicit base class to make the dependency between the two updaters clearer and avoid relying on implicit inheritance/import structure.
- In the unzip flows you now call `os.path.normpath` both inside `_normalize_archive_root_dir` and again when building `update_root_path`; if the second normalization is not strictly needed, removing it would simplify the path handling and reduce confusion about where normalization is expected to happen.

## Individual Comments

### Comment 1
<location path="astrbot/core/zip_updator.py" line_range="244-248" />
<code_context>
         logger.info(f"Extracting archive: {zip_path}")
         with zipfile.ZipFile(zip_path, "r") as z:
-            update_dir = z.namelist()[0]
+            update_dir = self._normalize_archive_root_dir(z.namelist()[0])
             z.extractall(target_dir)

-        files = os.listdir(os.path.join(target_dir, update_dir))
+        update_root_path = os.path.normpath(os.path.join(target_dir, update_dir))
+        files = os.listdir(update_root_path)
         for f in files:
</code_context>
<issue_to_address>
**issue (bug_risk):** Handle the case where `_normalize_archive_root_dir` returns an empty string to avoid deleting/moving from `target_dir` itself.

If `_normalize_archive_root_dir` returns `""` (e.g., first entry normalizes to `"."`), `update_root_path` becomes `target_dir`. Then `os.listdir(update_root_path)` iterates `target_dir` directly, and for each entry you `shutil.rmtree(target_item_path)` and then `shutil.move(update_item_path, target_dir)` where `update_item_path == target_item_path` and has just been deleted. This both breaks the move and removes the extracted content. Please guard this logic when `update_dir` is empty (or `update_root_path == target_dir`), e.g., by skipping the move/delete loop in that case.
</issue_to_address>

### Comment 2
<location path="astrbot/core/star/updator.py" line_range="78-81" />
<code_context>
         logger.info(f"Extracting archive: {zip_path}")
         with zipfile.ZipFile(zip_path, "r") as z:
-            update_dir = z.namelist()[0]
+            update_dir = self._normalize_archive_root_dir(z.namelist()[0])
             z.extractall(target_dir)

-        files = os.listdir(os.path.join(target_dir, update_dir))
+        update_root_path = os.path.normpath(os.path.join(target_dir, update_dir))
+        files = os.listdir(update_root_path)
         for f in files:
</code_context>
<issue_to_address>
**issue (bug_risk):** Apply a similar guard in the star updater when the normalized archive root is empty.

Because `_normalize_archive_root_dir` can return an empty string, `update_root_path` may equal `target_dir`. In that case the loop will operate on `target_dir` itself, deleting its contents and then trying to move files from paths that no longer exist, potentially removing files that should remain. Adding a guard (e.g. `if not update_dir: return` or skipping the move/delete loop when `update_root_path == target_dir`) avoids this while still supporting archives whose top-level entry is a real subdirectory.
</issue_to_address>

### Comment 3
<location path="tests/test_updator_socks.py" line_range="454" />
<code_context>
+        captured.setdefault("listdir", path)
+        return [".dockerignore"]
+
+    monkeypatch.setattr(
+        zip_updator_module.os, "makedirs", lambda path, exist_ok=True: None
+    )
</code_context>
<issue_to_address>
**suggestion (testing):** Assert that the zip file itself is removed, or avoid capturing it if you don’t use it

In `test_repo_unzip_file_normalizes_windows_extended_length_paths`, `os.remove` is monkeypatched to capture the "removed" path, but this value is never asserted. Please either assert that `captured["removed"] == "temp.zip"` to confirm the archive is cleaned up, or remove the monkeypatch and capture if it’s not needed.

Suggested implementation:

```python
    expected_root = ntpath.normpath(ntpath.join(target_dir, archive_root))
    expected_file = ntpath.normpath(
        ntpath.join(target_dir, archive_root, ".dockerignore")
    )
    # used to capture side effects like listdir path and removed zip archive
    captured: dict[str, object] = {"removed": None}

```

In `test_repo_unzip_file_normalizes_windows_extended_length_paths`:

1. Ensure `os.remove` is monkeypatched to populate `captured["removed"]`, e.g.:

   ```python
   monkeypatch.setattr(
       zip_updator_module.os,
       "remove",
       lambda path: captured.__setitem__("removed", path),
   )
   ```

2. After the function under test has been invoked and any other assertions on `expected_root` / `expected_file` are made, add:

   ```python
   assert captured["removed"] == "temp.zip"
   ```

If you prefer not to assert on removal, instead delete the `os.remove` monkeypatch and any use of `captured["removed"]` so the unnecessary capture is removed entirely.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment thread astrbot/core/zip_updator.py Outdated
Comment thread astrbot/core/star/updator.py Outdated
captured.setdefault("listdir", path)
return [".dockerignore"]

monkeypatch.setattr(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): Assert that the zip file itself is removed, or avoid capturing it if you don’t use it

In test_repo_unzip_file_normalizes_windows_extended_length_paths, os.remove is monkeypatched to capture the "removed" path, but this value is never asserted. Please either assert that captured["removed"] == "temp.zip" to confirm the archive is cleaned up, or remove the monkeypatch and capture if it’s not needed.

Suggested implementation:

    expected_root = ntpath.normpath(ntpath.join(target_dir, archive_root))
    expected_file = ntpath.normpath(
        ntpath.join(target_dir, archive_root, ".dockerignore")
    )
    # used to capture side effects like listdir path and removed zip archive
    captured: dict[str, object] = {"removed": None}

In test_repo_unzip_file_normalizes_windows_extended_length_paths:

  1. Ensure os.remove is monkeypatched to populate captured["removed"], e.g.:

    monkeypatch.setattr(
        zip_updator_module.os,
        "remove",
        lambda path: captured.__setitem__("removed", path),
    )
  2. After the function under test has been invoked and any other assertions on expected_root / expected_file are made, add:

    assert captured["removed"] == "temp.zip"

If you prefer not to assert on removal, instead delete the os.remove monkeypatch and any use of captured["removed"] so the unnecessary capture is removed entirely.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:core The bug / feature is about astrbot's core, backend feature:updater The bug / feature is about astrbot updater system size:M This PR changes 30-99 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant