[SM] Automatically prune cache entries older than 30 days #17585

Merged
AutomatedTester merged 4 commits into trunk from worktree-issue-17196-cache-expiry on Jun 8, 2026

@AutomatedTester

Member

Selenium Manager now removes driver/browser version directories from the cache (~/.cache/selenium) that have not been modified in over 30 days. The prune runs on every invocation after clear-cache/clear-metadata, so the cache size is bounded without requiring manual intervention.

Fixes #17196

🤖 AI assistance

  • No substantial AI assistance used
  • AI assisted (complete below)
    • Tool(s):
    • What was generated:
    • I reviewed all AI output and can explain the change

🔄 Types of changes

  • New feature (non-breaking change which adds functionality and tests!)

@selenium-ci added the C-rust (Rust code is mostly Selenium Manager) and B-manager (Selenium Manager) labels on May 28, 2026

@bonigarcia bonigarcia left a comment

Member

The current implementation uses the modification time of the cached assets to select which ones to prune. But some assets should stay in the cache even when that timestamp is older than 30 days: any driver that is still in use qualifies. For example, geckodriver has a very low release frequency (the last release was in February 2025).

I have something different in mind for this feature. I think the best solution is to add a field to the SM metadata (se-metadata.json) that is updated whenever an asset (driver or browser) is used (e.g., last_used or something similar). SM reads the metadata file each time a driver/browser is resolved. At that point, besides updating the existing info for the resolved driver/browser and its new last_used value, SM checks the remaining assets (drivers/browsers) and prunes any whose last_used is older than the current date minus CACHE_TTL_DAYS.
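The proposal above can be sketched roughly as follows. This is a minimal illustration, not Selenium Manager's code: the `Asset` struct, the `touch_and_prune` function, and the in-memory `Vec` standing in for se-metadata.json are hypothetical. Only `last_used` and `CACHE_TTL_DAYS` come from the comment, and the value 30 comes from the PR title.

```rust
/// TTL in days. The name follows the review comment; 30 follows the PR title.
const CACHE_TTL_DAYS: u64 = 30;
const SECONDS_PER_DAY: u64 = 24 * 60 * 60;

/// Hypothetical metadata record; the real se-metadata.json schema may differ.
#[derive(Debug, Clone, PartialEq)]
struct Asset {
    name: String,
    version: String,
    /// Unix timestamp (seconds) of the last time this asset was resolved.
    last_used: u64,
}

/// Refreshes `last_used` for the asset being resolved, then removes every
/// asset whose `last_used` is older than `now - CACHE_TTL_DAYS`.
/// Returns the removed records so the caller can delete their directories.
fn touch_and_prune(assets: &mut Vec<Asset>, name: &str, version: &str, now: u64) -> Vec<Asset> {
    // Step 1: the asset in use gets a fresh timestamp, so it can never be
    // pruned by the pass below, however old its files are on disk.
    for asset in assets.iter_mut() {
        if asset.name == name && asset.version == version {
            asset.last_used = now;
        }
    }

    // Step 2: split the remaining records into kept and stale.
    let cutoff = now.saturating_sub(CACHE_TTL_DAYS * SECONDS_PER_DAY);
    let (kept, pruned): (Vec<Asset>, Vec<Asset>) =
        assets.drain(..).partition(|a| a.last_used >= cutoff);
    *assets = kept;
    pruned
}
```

Under these assumptions, a geckodriver entry that was last touched 40 days ago but is being resolved right now survives, while an unused chromedriver entry of the same age is returned for deletion.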

@AutomatedTester

Member Author

I forgot about how infrequent geckodriver releases are... I will update accordingly

Replace the file-mtime pruning approach with a metadata-driven one, as
suggested by @bonigarcia. Each Driver and Browser metadata entry now
carries a `last_used` Unix timestamp (defaulting to now for entries read
from older metadata files). The timestamp is refreshed whenever SM
resolves a driver or browser from the cache.

Pruning now reads se-metadata.json, removes entries whose `last_used` is
older than CACHE_TTL_DAYS (30), deletes the matching version directories
from disk, and writes the updated metadata back. This correctly handles
infrequently-released drivers like geckodriver that should not be pruned
just because they have an old mtime.

The previous approach stored last_used on the Driver/Browser structs,
which are subject to TTL filtering in get_metadata(). After 1 hour the
TTL entry expires; the next write_metadata call (triggered by any
driver/browser lookup) purges it from the file, discarding last_used
with it. As a result prune_old_cache_entries could never find old
entries to remove.

Fix: track usage in a new cached_assets section of se-metadata.json
that is never touched by the TTL retain logic. update_cached_asset
upserts a {asset_name, asset_version, last_used} record whenever a
driver or browser binary is served from the local cache. Pruning reads
only cached_assets, so it remains correct even after the short-lived
TTL entries have been flushed by unrelated SM invocations.
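The interaction between the short-lived TTL entries and the separate usage records can be sketched like this. It is an illustration under stated assumptions rather than the code in rust/src/metadata.rs: the struct shapes, the `retain_unexpired` helper, the method signatures, and the explicit `now` parameter are hypothetical. The names `cached_assets`, `update_cached_asset`, `prune_old_cache_entries`, and the `{asset_name, asset_version, last_used}` fields are taken from the commit message.

```rust
/// 30-day cache TTL in seconds (CACHE_TTL_DAYS from the commit message).
const CACHE_TTL_SECS: u64 = 30 * 24 * 60 * 60;

/// Short-lived resolution entry (hypothetical shape), subject to the TTL filter.
struct DriverEntry {
    driver_name: String,
    driver_version: String,
    /// Unix timestamp after which the entry is dropped (about 1 hour out).
    expiry: u64,
}

/// Long-lived usage record kept in the `cached_assets` section.
#[derive(Debug, Clone, PartialEq)]
struct CachedAsset {
    asset_name: String,
    asset_version: String,
    last_used: u64,
}

#[derive(Default)]
struct Metadata {
    drivers: Vec<DriverEntry>,
    cached_assets: Vec<CachedAsset>,
}

impl Metadata {
    /// TTL filter: drops expired resolution entries and deliberately
    /// leaves `cached_assets` alone.
    fn retain_unexpired(&mut self, now: u64) {
        self.drivers.retain(|d| d.expiry > now);
    }

    /// Upsert: refresh `last_used` if the record exists, otherwise add it.
    fn update_cached_asset(&mut self, name: &str, version: &str, now: u64) {
        let found = self
            .cached_assets
            .iter()
            .position(|a| a.asset_name == name && a.asset_version == version);
        match found {
            Some(i) => self.cached_assets[i].last_used = now,
            None => self.cached_assets.push(CachedAsset {
                asset_name: name.to_string(),
                asset_version: version.to_string(),
                last_used: now,
            }),
        }
    }

    /// Removes and returns usage records older than the 30-day cutoff.
    fn prune_old_cache_entries(&mut self, now: u64) -> Vec<CachedAsset> {
        let cutoff = now.saturating_sub(CACHE_TTL_SECS);
        let (kept, stale): (Vec<CachedAsset>, Vec<CachedAsset>) =
            std::mem::take(&mut self.cached_assets)
                .into_iter()
                .partition(|a| a.last_used >= cutoff);
        self.cached_assets = kept;
        stale
    }
}
```

Because `retain_unexpired` only filters `drivers`, a usage record outlives the one-hour entries and can actually reach the 30-day cutoff, which was impossible while `last_used` lived on the TTL-filtered structs.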

After deleting a stale version directory (e.g. chromedriver/linux/x86_64/120.0/),
walk upward and remove any ancestor directories that are now empty, stopping at
the cache root. This avoids leaving behind empty OS/arch scaffolding after a
driver or browser is pruned.
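The upward walk can be sketched with the standard library alone. The function name `remove_version_dir` and its signature are hypothetical and chosen for illustration; the merged code may structure this differently.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Deletes `version_dir`, then removes each ancestor that has become empty,
/// stopping at `cache_root`, which is never removed.
fn remove_version_dir(cache_root: &Path, version_dir: &Path) -> io::Result<()> {
    // Refuse to touch anything that is not strictly inside the cache root.
    if version_dir == cache_root || !version_dir.starts_with(cache_root) {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "path is not inside the cache root",
        ));
    }

    // Remove the stale version directory and everything in it.
    fs::remove_dir_all(version_dir)?;

    // Walk upward: arch directory, then OS directory, then driver directory.
    let mut current = version_dir.parent();
    while let Some(dir) = current {
        if dir == cache_root || !dir.starts_with(cache_root) {
            break;
        }
        // A non-empty directory means another version still lives here.
        if fs::read_dir(dir)?.next().is_some() {
            break;
        }
        fs::remove_dir(dir)?;
        current = dir.parent();
    }
    Ok(())
}
```

Stopping at the first non-empty ancestor keeps sibling versions and other drivers intact; only the scaffolding that belonged solely to the pruned version disappears.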
@AutomatedTester force-pushed the worktree-issue-17196-cache-expiry branch from a89f7b8 to d7988b3 on June 8, 2026 13:58
@AutomatedTester merged commit d33b389 into trunk on Jun 8, 2026
63 checks passed
@AutomatedTester deleted the worktree-issue-17196-cache-expiry branch on June 8, 2026 19:41
@diemol

diemol commented Jun 22, 2026

Member

Are the docs up to date with this feature?

@bonigarcia

Member

I created a PR in the Selenium doc about it: SeleniumHQ/seleniumhq.github.io#2673

Development

Successfully merging this pull request may close these issues.

[🚀 Feature]: Automatically prune Selenium Manager cache

4 participants