fix(core): invalidate config cache when file is modified by another process #662

Merged
groksrc merged 3 commits into main from fix/660-config-cache-staleness
Mar 11, 2026

Conversation

groksrc (Member) commented Mar 10, 2026

Summary

  • Add mtime-based cache validation to ConfigManager.load_config() so long-lived processes (MCP stdio server) detect external config changes
  • Track _CONFIG_MTIME alongside _CONFIG_CACHE — cheap os.stat() per access, re-read only when mtime differs
  • Reset both cache vars in save_config()
  • Update all test fixtures that clear _CONFIG_CACHE to also clear _CONFIG_MTIME

Previously, running bm project set-cloud in a terminal had no effect on the MCP server because the config was cached at the module level and never re-read from disk.
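The approach in the summary can be sketched in a few lines. This is a minimal, self-contained model and not the project's actual `ConfigManager`: the JSON file format, the `load_config(path)` signature, and the `demo()` helper are assumptions made for illustration, while `_CONFIG_CACHE` and `_CONFIG_MTIME` mirror the names used in the PR.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Optional

# Hypothetical stand-ins for the module-level globals named in the PR.
_CONFIG_CACHE: Optional[dict] = None
_CONFIG_MTIME: Optional[float] = None


def load_config(path: Path) -> dict:
    """Return the cached config unless the file's mtime has changed."""
    global _CONFIG_CACHE, _CONFIG_MTIME
    if _CONFIG_CACHE is not None:
        try:
            current_mtime = path.stat().st_mtime
        except OSError:
            current_mtime = None  # file deleted or unreadable: force a re-read
        if current_mtime is not None and current_mtime == _CONFIG_MTIME:
            return _CONFIG_CACHE
    # Cache miss: re-read from disk and remember the mtime we observed.
    _CONFIG_CACHE = json.loads(path.read_text())
    _CONFIG_MTIME = path.stat().st_mtime
    return _CONFIG_CACHE


def demo() -> tuple[bool, str]:
    """Simulate an external write and report (cache_hit, reloaded_mode)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "config.json"
        path.write_text(json.dumps({"mode": "local"}))
        first = load_config(path)
        cache_hit = load_config(path) is first
        # "Another process" rewrites the file; bump mtime explicitly so the
        # change is visible even on filesystems with coarse timestamps.
        path.write_text(json.dumps({"mode": "cloud"}))
        st = path.stat()
        os.utime(path, (st.st_atime, st.st_mtime + 1))
        return cache_hit, load_config(path)["mode"]
```

Calling `demo()` exercises both paths: the second load returns the same object, and the load after the simulated external write picks up the new value.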

Test plan

  • New tests: cache hit when unchanged, cache miss when mtime changes, save_config resets mtime
  • 366 existing tests pass (CLI + config + project service)
  • ruff check, ruff format, pyright all clean

Fixes #660

🤖 Generated with Claude Code

github-actions bot (Contributor) commented Mar 10, 2026

Claude finished @groksrc's task in 2m 41s


PR Review: Invalidate config cache when file is modified by another process

Tasks:

  • Gather context and read PR changes
  • Review src/basic_memory/config.py changes
  • Review test files
  • Check against team checklist
  • Post final review

Summary

This PR correctly addresses a real-world staleness problem for long-lived processes (MCP stdio server). The approach is sound and implementation is clean. A few observations worth noting below.


Code Quality

src/basic_memory/config.py

The mtime-based cache validation is a well-chosen, low-overhead solution. A few notes:

  1. Literate style: excellent. The Trigger/Why/Outcome comment pattern is followed precisely:

    # Trigger: cached config exists but the on-disk file may have been
    # modified by another process (CLI command in a different terminal).
    # Why: the MCP server is long-lived; without this check it would
    # serve stale project routing forever.
    # Outcome: cheap os.stat() per access; re-read only when mtime differs.
  2. Minor: _STALE_KEYS defined inside load_config(). Because the definition sits after the cache check, cache hits return early before reaching it, so the set is re-created only on cache misses. That is still slightly wasteful; it could be a module-level constant.

    # Currently on line 699, inside load_config(), after cache check:
    _STALE_KEYS = {
        "default_project_mode",
        "project_modes",
        "cloud_projects",
        "cloud_mode",
    }

    Low priority, but an easy fix.

  3. TOCTOU (Time-of-Check-Time-of-Use) race window. The file mtime is read twice: once on line 683 (to check against the cached mtime) and again on line 749 (to record after reading). If another process writes the file after the content has been read but before the second stat(), the cache stores old content with a new mtime, and future calls see an unchanged mtime and serve the stale content. This is:

    • Standard for mtime-based caches
    • Extremely rare in practice
    • Self-correcting on the next external write
    • Acceptable for this use case
  4. OSError handling on stat (lines 683–685) is correct. When the file can't be stat'd (deleted, permission denied), current_mtime = None causes cache invalidation and falls through to re-read. This is the right behavior.

  5. save_config() correctly resets both globals (lines 783–787). The reset-before-return pattern (rather than update-in-place) is safe and correct.

  6. _CONFIG_MTIME is not populated when config file doesn't exist (lines 776–779). When the file is absent, a default config is created, save_config() is called (which clears both globals), and the local config is returned uncached. The next call will re-read the freshly written file and populate both _CONFIG_CACHE and _CONFIG_MTIME. This is correct, so no change is needed.
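For readers who want to see the TOCTOU window from note 3 concretely, the sketch below contrasts recording the stat signature after the read (the ordering described in the review) with recording it before the read. It is an illustration under assumptions, not the PR's code: the function names, the JSON format, and the `after_read` hook (used only to simulate a concurrent writer deterministically) are all hypothetical. Recording the pre-read signature is one common way to narrow the window; the PR itself accepts the race.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Callable, Optional

Signature = tuple[float, int]


def signature(path: Path) -> Signature:
    """Current (mtime, size) of the file."""
    st = path.stat()
    return (st.st_mtime, st.st_size)


def load_post_read_stat(path: Path, after_read: Optional[Callable[[], None]] = None):
    """Read first, stat afterwards: a late write is recorded as 'already seen'."""
    data = json.loads(path.read_text())
    if after_read is not None:
        after_read()  # hypothetical hook: simulate a concurrent writer
    return data, signature(path)


def load_pre_read_stat(path: Path, after_read: Optional[Callable[[], None]] = None):
    """Stat first, read afterwards: a late write leaves a mismatching signature."""
    sig = signature(path)
    data = json.loads(path.read_text())
    if after_read is not None:
        after_read()
    return data, sig


def demo() -> tuple[bool, bool]:
    """Return (post_read_looks_fresh, pre_read_looks_fresh) after a racing write."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "config.json"

        def reset() -> None:
            path.write_text('{"mode": "local"}')

        def concurrent_write() -> None:
            path.write_text('{"mode": "cloud!"}')
            st = path.stat()
            os.utime(path, (st.st_atime, st.st_mtime + 1))

        reset()
        _, post_sig = load_post_read_stat(path, concurrent_write)
        post_looks_fresh = post_sig == signature(path)  # stale data hidden

        reset()
        _, pre_sig = load_pre_read_stat(path, concurrent_write)
        pre_looks_fresh = pre_sig == signature(path)  # mismatch forces a re-read
        return post_looks_fresh, pre_looks_fresh
```

With the post-read ordering the stored signature matches the file even though the cached content is old; with the pre-read ordering the next access sees a mismatch and re-reads, at the cost of an occasional redundant read.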


Tests

tests/test_config.py — New tests are solid:

  • test_cache_hit_when_unchanged: Verifies same object identity on repeat calls. ✅
  • test_cache_invalidated_when_file_modified: Uses os.utime() to force mtime change on coarse-granularity filesystems — robust approach. ✅
  • test_save_config_resets_mtime: Directly inspects module globals after save. ✅

Fixture updates — All conftest files that previously cleared _CONFIG_CACHE now also clear _CONFIG_MTIME. Cross-checked against all 6 modified conftest/test files. ✅

One observation on test isolation: Several tests in TestBasicMemoryConfig (e.g., test_stale_default_project_loaded_from_file) manually set _CONFIG_CACHE = None and _CONFIG_MTIME = None at the start but do not restore them after. The populated cache from load_config() within those tests will persist to subsequent tests. Since conftest.py's config_manager fixture clears these globals, tests using that fixture are protected — but free-standing tests within TestConfigManager that don't use config_manager could be affected by leftover state from TestBasicMemoryConfig tests that ran before them. This is a pre-existing pattern, not a regression introduced by this PR.
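One way to avoid leaking cache state between tests is to snapshot the globals and restore them afterwards. The sketch below is a hypothetical helper, not something in the repository: `config_module` is a `SimpleNamespace` standing in for the real module, and `isolated_config_cache` is an invented name.

```python
from contextlib import contextmanager
from types import SimpleNamespace
from typing import Iterator

# Hypothetical stand-in for the real config module's globals.
config_module = SimpleNamespace(_CONFIG_CACHE=None, _CONFIG_MTIME=None)


@contextmanager
def isolated_config_cache(module=config_module) -> Iterator[None]:
    """Clear the cache globals for the duration of the block, then restore them."""
    saved = (module._CONFIG_CACHE, module._CONFIG_MTIME)
    module._CONFIG_CACHE = None
    module._CONFIG_MTIME = None
    try:
        yield
    finally:
        # Restore even if the test body raises.
        module._CONFIG_CACHE, module._CONFIG_MTIME = saved
```

In a pytest suite the same snapshot-and-restore logic could live in a fixture that yields between the clear and the restore, so tests that populate the cache do not affect whichever test runs next.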


Team Checklist

| Category | Item | Status |
|---|---|---|
| Code Quality | Follows Basic Memory coding conventions (literate style, Trigger/Why/Outcome) | |
| | Python 3.12+ type annotations | |
| | SQLAlchemy 2.0 / FastAPI / Typer conventions | N/A |
| | 100-character line length | |
| | No commented-out code | |
| Testing | Unit tests for new functions | |
| | Integration tests for MCP tools | N/A |
| | Edge cases covered (file gone, mtime change, save reset) | |
| | 100% test coverage maintained | |
| | Documentation updated | N/A |
| Architecture | Preserves local-first principles | |
| | No DB schema changes (no migrations needed) | |
| Security | No hardcoded secrets | |
| | Proper error handling (OSError caught) | |
| | No sensitive data in logs | |
| Compatibility | Windows-compatible path/mtime ops (os.stat()) | |
| | No emoji/unicode in console or log output | |

Verdict

Approve. The implementation is correct, clean, and well-tested. The single notable edge case (TOCTOU race) is acceptable and standard for mtime-based caches. The _STALE_KEYS placement is a minor style nit that can be addressed as a follow-up if desired.

chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 089d7b1bb6

Comment on lines +687 to +688
    if current_mtime is not None and current_mtime == _CONFIG_MTIME:
        return _CONFIG_CACHE

P2: Use stronger change detection than second-level mtime

The cache hit check relies on current_mtime == _CONFIG_MTIME, so on filesystems with coarse timestamp granularity (the tests already acknowledge 1s granularity) a cross-process config write that happens within the same timestamp tick will not invalidate the cache and the process will keep serving stale config. This means long-lived MCP processes can still miss project set-cloud/set-local updates in the exact workflow this patch is trying to fix.
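The concern can be reproduced deterministically by pinning the modification time back to its previous value after a rewrite, which emulates two writes landing in the same timestamp tick. The sketch below is illustrative only: the file contents and the `demo()` helper are assumptions, and comparing `(mtime, size)` is shown as one possible stronger check.

```python
import os
import tempfile
from pathlib import Path


def demo() -> tuple[bool, bool]:
    """Return (mtime_only_detects_change, mtime_and_size_detects_change)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "config.json"
        path.write_text('{"cloud_mode": false}')
        before = path.stat()

        # Second write "within the same tick": rewrite the file, then pin the
        # timestamps back to their earlier values to emulate a coarse clock.
        path.write_text('{"cloud_mode": true}')
        os.utime(path, ns=(before.st_atime_ns, before.st_mtime_ns))
        after = path.stat()

        mtime_only = after.st_mtime != before.st_mtime
        mtime_and_size = (after.st_mtime, after.st_size) != (
            before.st_mtime,
            before.st_size,
        )
        return mtime_only, mtime_and_size
```

Here the mtime-only comparison misses the rewrite while the combined comparison catches it because the byte length changed. An edit that keeps the byte length unchanged within the same tick would still slip past both checks.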


@groksrc groksrc force-pushed the fix/660-config-cache-staleness branch from 089d7b1 to 49897a6 Compare March 10, 2026 18:47
github-actions bot (Contributor) commented Mar 10, 2026

Claude finished @groksrc's task in 2m 27s


PR Review: Invalidate config cache when file is modified by another process

Tasks:

  • Gather context and understand PR
  • Read src/basic_memory/config.py changes
  • Review test files
  • Check against team checklist
  • Post final review

Summary

The approach is correct and well-suited for the problem. An mtime-based cache guard is a low-overhead, platform-compatible solution for cross-process config invalidation. The implementation is clean and the comment style follows the Trigger/Why/Outcome convention from CLAUDE.md. One potential test regression was found.


Code Quality

src/basic_memory/config.py

The core logic at lines 681–692 is correct:

if _CONFIG_CACHE is not None:
    try:
        current_mtime = self.config_file.stat().st_mtime
    except OSError:
        current_mtime = None

    if current_mtime is not None and current_mtime == _CONFIG_MTIME:
        return _CONFIG_CACHE

    # mtime changed or file gone — invalidate and fall through to re-read
    _CONFIG_CACHE = None
    _CONFIG_MTIME = None
  • OSError guard (line 684) correctly handles deleted/inaccessible files, falling through to a re-read. ✅
  • save_config() resetting both globals at lines 786–787 is correct. ✅
  • Module-level comment on _CONFIG_MTIME (line 632–634) explains the "why" well. ✅
  • Windows compatibility: os.stat().st_mtime is standard on Windows. ✅

Minor nit: _STALE_KEYS is defined inside load_config() (line 699). This is re-created on every cache miss. Since it's a constant, it could be a module-level set. Low priority.

TOCTOU window (lines 683 vs 749): The file mtime is read twice — once to check the cache and once to record after re-reading. A concurrent write between these two calls would store stale content with a fresh mtime. This is standard and self-correcting on the next external write. Acceptable for this use case.

Codex's coarse-granularity concern (P2): Filesystems with 1-second mtime resolution (FAT32, some network mounts) could miss a write that lands within the same second. The test already handles this with os.utime() forcing mtime +1. In practice, the primary target (MCP stdio server detecting bm project set-cloud) involves human-driven operations with well-separated timestamps. Acceptable risk.


Potential Test Regression

tests/mcp/test_tool_write_note.py lines 1259 and 1283 — This file was not updated in this PR, but the new mtime logic changes its behavior:

# line 1259 - injects a modified config directly
config_module._CONFIG_CACHE = app_config   # _CONFIG_MTIME is still None

With the new code, any subsequent load_config() call will:

  1. See _CONFIG_CACHE is not None → enter the mtime check block
  2. Get a valid current_mtime from stat()
  3. Evaluate current_mtime == _CONFIG_MTIME → float == None → False
  4. Invalidate the cache and re-read from disk — discarding the injected app_config

The test injects app_config.write_note_overwrite_default = True to test overwrite behavior. If the injected config is silently discarded before write_note() reads it, the test may be testing the wrong code path.

This test passes today only if write_note() doesn't call load_config() in the hot path during these tests (e.g., if the config is read earlier via the app container fixture and cached at the API layer). If that assumption ever changes, this test will silently regress.

Recommended fix: Update test_tool_write_note.py to set _CONFIG_MTIME alongside _CONFIG_CACHE when injecting config directly:

config_module._CONFIG_CACHE = app_config
config_module._CONFIG_MTIME = float("inf")  # prevent mtime invalidation

Or better, use save_config() + load_config() to go through the proper path.


Tests

  • test_cache_hit_when_unchanged: Same-object identity check. ✅
  • test_cache_invalidated_when_file_modified: Uses os.utime() to force mtime change — robust across coarse-granularity filesystems. ✅
  • test_save_config_resets_mtime: Directly inspects module globals. ✅
  • All 6 conftest/fixture files updated to clear _CONFIG_MTIME alongside _CONFIG_CACHE. ✅

Team Checklist

| Category | Item | Status |
|---|---|---|
| Code Quality | Follows Basic Memory coding conventions (literate style, Trigger/Why/Outcome) | |
| | Python 3.12+ type annotations | |
| | 100-character line length | |
| | No commented-out code | |
| Testing | Unit tests for new functions | |
| | Edge cases covered (file gone, mtime change, save reset) | |
| | 100% test coverage maintained | |
| | Pre-existing test isolation concern in TestBasicMemoryConfig | ⚠️ (pre-existing, not a regression) |
| Architecture | Preserves local-first principles | |
| | No DB schema changes (no migrations needed) | |
| Security | No hardcoded secrets | |
| | Proper error handling (OSError caught) | |
| | No sensitive data in logs | |
| Compatibility | Windows-compatible os.stat().st_mtime | |
| | No emoji/unicode in console or log output | |

Verdict

Implementation is correct and the approach is well-chosen. The one item worth addressing before merge is the potential test regression in test_tool_write_note.py — updating lines 1259 and 1283 to also set _CONFIG_MTIME (or using float("inf") to prevent invalidation) would make the test intent robust against the new mtime guard.

@groksrc groksrc changed the title from "Invalidate config cache when file is modified by another process" to "fix(core): invalidate config cache when file is modified by another process" Mar 10, 2026
groksrc and others added 2 commits March 10, 2026 14:16
Add mtime-based cache validation so long-lived processes (like the MCP
stdio server) detect when the config file has been modified externally
(e.g. by `bm project set-cloud` in a separate terminal). This is a
cheap os.stat() call per config access that only triggers a re-read
when the file mtime has actually changed.

Fixes #660

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Drew Cain <groksrc@gmail.com>

Addresses Codex review feedback: on filesystems with coarse mtime
granularity (1s resolution), two writes within the same second would
share the same mtime. Adding file size to the check catches config
changes even when mtime doesn't change, since config edits almost
always change file size.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Drew Cain <groksrc@gmail.com>
@groksrc groksrc force-pushed the fix/660-config-cache-staleness branch from 49897a6 to b3fc169 Compare March 10, 2026 19:16
github-actions bot (Contributor) commented Mar 10, 2026

Claude finished @groksrc's task in 3m 34s


PR Review: Invalidate config cache when file is modified by another process

Tasks:

  • Gather context and read PR changes
  • Review src/basic_memory/config.py changes
  • Review test files
  • Check tests/mcp/test_tool_write_note.py for potential regression
  • Check against team checklist
  • Post final review

Summary

This PR adds mtime+size-based cache invalidation to ConfigManager.load_config() — a correct, low-overhead fix for the config staleness problem affecting long-lived MCP processes. The latest commit (b3fc169) upgraded from mtime-only to mtime+size, which addresses the coarse filesystem granularity concern raised by the Codex review. Most of the implementation is solid, but there is one confirmed regression in the test suite that must be fixed before merge.


Confirmed Test Regression

tests/mcp/test_tool_write_note.py lines 1259 and 1283 — These lines inject a modified config by setting config_module._CONFIG_CACHE = app_config but do not set _CONFIG_MTIME or _CONFIG_SIZE.

The write_note tool directly reads write_note_overwrite_default from config at runtime (src/basic_memory/mcp/tools/write_note.py:148):

overwrite if overwrite is not None else ConfigManager().config.write_note_overwrite_default

ConfigManager().config calls load_config(). With the new cache guard:

  1. _CONFIG_CACHE is not None → enters mtime check block
  2. stat() returns a valid mtime from disk
  3. current_mtime == _CONFIG_MTIME → float == None → False
  4. Cache invalidated, re-read from disk — injected app_config is silently discarded
  5. write_note_overwrite_default reads as False instead of the injected True
  6. The overwrite test asserts "# Updated note" in result — but now it gets a NOTE_ALREADY_EXISTS conflict response

This is a confirmed regression that will cause test_write_note_config_overwrite_default_true to fail. The fix is to set all three globals when injecting the cache:

# Line 1259 and 1283 — add _CONFIG_MTIME and _CONFIG_SIZE alongside _CONFIG_CACHE
config_module._CONFIG_CACHE = app_config
config_module._CONFIG_MTIME = float("inf")   # prevents mtime invalidation
config_module._CONFIG_SIZE = 0               # prevents size invalidation


Code Quality

src/basic_memory/config.py

The mtime+size dual-check is well-chosen — guarding against coarse-granularity filesystems where two writes within the same second share the same mtime. The Trigger/Why/Outcome comment style is followed correctly:

# Outcome: cheap os.stat() per access; re-read only when mtime or size differs.

A few minor notes:

  1. _STALE_KEYS is still defined inside load_config() (line 709). It's a constant set that's re-created on every cache miss. Moving it to module level is a trivial cleanup. Low priority.


  2. TOCTOU race (lines 685 vs 759): stat() is called twice — once to check the cache, once to record after re-reading. A concurrent write between these two calls could store stale content with a fresh mtime/size. This is standard for mtime-based caches, self-correcting on the next external write, and acceptable for this use case.

  3. OSError handling (lines 688–690) is correct. File gone → current_mtime = None → cache invalidated → falls through to re-read. ✅

  4. save_config() resets all three globals (lines 799–801). Correct. ✅


Tests

  • test_cache_returns_same_config_when_file_unchanged: Same-object identity check. ✅
  • test_cache_invalidated_when_file_modified: Uses os.utime() to force mtime change — robust across coarse-granularity filesystems. ✅
  • test_save_config_resets_mtime: Directly inspects all three module globals (_CONFIG_CACHE, _CONFIG_MTIME, _CONFIG_SIZE). ✅
  • All conftest/fixture files properly updated to clear _CONFIG_SIZE alongside _CONFIG_CACHE and _CONFIG_MTIME. ✅

Team Checklist

| Category | Item | Status |
|---|---|---|
| Code Quality | Follows Basic Memory coding conventions (literate style, Trigger/Why/Outcome) | |
| | Python 3.12+ type annotations | |
| | SQLAlchemy 2.0 / FastAPI / Typer conventions | N/A |
| | 100-character line length | |
| | No commented-out code | |
| Testing | Unit tests for new functions | |
| | Integration tests for MCP tools | N/A |
| | Edge cases covered (file gone, mtime+size change, save reset) | |
| | 100% test coverage maintained | |
| | test_tool_write_note.py cache injection missing _CONFIG_MTIME/_CONFIG_SIZE | ❌ (regression) |
| Architecture | Preserves local-first principles | |
| | No DB schema changes (no migrations needed) | |
| Security | No hardcoded secrets | |
| | Proper error handling (OSError caught) | |
| | No sensitive data in logs | |
| Compatibility | Windows-compatible path/mtime ops (os.stat()) | |
| | No emoji/unicode in console or log output | |

Verdict

Request changes. One confirmed regression: tests/mcp/test_tool_write_note.py injects _CONFIG_CACHE without setting _CONFIG_MTIME/_CONFIG_SIZE, which causes the new guard to immediately discard the injected config when write_note calls load_config() at runtime. Fix lines 1259 and 1283, then this is ready to merge.

…ests

The mtime+size cache guard introduced in this branch would discard
injected _CONFIG_CACHE values because float("inf") never matches any
real file mtime. Use the actual config file's mtime and size so the
guard sees a match and keeps the injected config.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Drew Cain <groksrc@gmail.com>
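
The commit message's reasoning can be checked with a small model of the guard. This is a sketch under assumptions: `guard_keeps_cache` and `demo` are hypothetical names, and the function only mirrors the equality comparison quoted in the reviews rather than the project's real `load_config()`.

```python
import tempfile
from pathlib import Path
from typing import Optional


def guard_keeps_cache(
    path: Path, cached_mtime: Optional[float], cached_size: Optional[int]
) -> bool:
    """Model of the cache guard: True means the cached config would be kept."""
    try:
        st = path.stat()
    except OSError:
        return False  # file missing or unreadable: cache is invalidated
    return st.st_mtime == cached_mtime and st.st_size == cached_size


def demo() -> dict[str, bool]:
    """Compare three ways a test might set the tracking globals."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "config.json"
        path.write_text("{}")
        st = path.stat()
        return {
            # Globals left at None: the equality check fails.
            "unset": guard_keeps_cache(path, None, None),
            # Sentinel values: a real mtime never equals float("inf").
            "sentinel": guard_keeps_cache(path, float("inf"), 0),
            # Values pinned to the on-disk file: the guard sees a match.
            "pinned": guard_keeps_cache(path, st.st_mtime, st.st_size),
        }
```

Because the guard tests for equality, only values copied from the file's actual stat result keep an injected config alive.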
github-actions bot (Contributor) commented Mar 11, 2026

Claude finished @groksrc's task in 1m 42s


PR Review: Invalidate config cache when file is modified by another process

Tasks:

  • Gather context and understand PR
  • Read current state of src/basic_memory/config.py
  • Check tests/mcp/test_tool_write_note.py for the regression fix (latest commit 6e7cfc7)
  • Review test files and conftest updates
  • Run through team checklist
  • Post final review

Summary

This PR correctly addresses the config staleness problem for long-lived processes (MCP stdio server). The three commits tell a clean story: initial mtime-only fix → upgrade to mtime+size for coarse-granularity filesystem safety → fix the cache injection pattern in write_note tests. All previously flagged issues have been resolved. This is ready to merge.


Code Quality: src/basic_memory/config.py

The implementation is sound. Key points:

  1. Trigger/Why/Outcome comment at lines 678–682 is correct and complete. Follows CLAUDE.md literate style precisely. ✅

  2. Dual mtime+size check (lines 692–696) correctly guards against coarse-granularity filesystems. The Codex P2 concern about 1-second mtime resolution is directly addressed. ✅

    if (
        current_mtime is not None
        and current_mtime == _CONFIG_MTIME
        and current_size == _CONFIG_SIZE
    ):
        return _CONFIG_CACHE
  3. OSError guard (lines 688–690) is correct. File gone or inaccessible → current_mtime = None → cache invalidated → falls through to re-read. ✅

  4. save_config() resets all three globals (lines 799–801).

  5. Minor nit: _STALE_KEYS is still defined inside load_config() (line 709). It's a constant set re-created on every cache miss. This is low priority and functionally harmless, but easy to clean up.

  6. TOCTOU window between lines 685 and 759 (two stat() calls: one to check cache, one to record after re-read). Standard for mtime-based caches, self-correcting on next external write, acceptable here. ✅


Confirmed Regression Fix: tests/mcp/test_tool_write_note.py

The regression identified in previous reviews (cache injection without _CONFIG_MTIME/_CONFIG_SIZE) is fully resolved in commit 6e7cfc7. Lines 1259–1264 and 1288–1291 now pin the real stat values from disk:

config_module._CONFIG_CACHE = app_config
# Pin mtime+size to the on-disk file so the cache guard sees a match
# and keeps our injected config instead of re-reading from disk.
_st = config_manager.config_file.stat()
config_module._CONFIG_MTIME = _st.st_mtime
config_module._CONFIG_SIZE = _st.st_size

This is the correct pattern — using real stat values instead of a sentinel like float("inf") means the cache behaves exactly as production code would. The finally block also properly restores all three globals. ✅


Tests: tests/test_config.py

Three new tests are well-designed:

  • test_cache_returns_same_config_when_file_unchanged (line 1196): Same-object identity check confirms cache hit. ✅
  • test_cache_invalidated_when_file_modified (line 1225): Uses os.utime() to force mtime +1, making the test filesystem-agnostic. ✅
  • test_save_config_resets_mtime (line 1270): Directly inspects all three module globals before and after save_config(). ✅

All six conftest/fixture files properly clear _CONFIG_SIZE alongside _CONFIG_CACHE and _CONFIG_MTIME. Cross-checked: ✅


Team Checklist

| Category | Item | Status |
|---|---|---|
| Code Quality | Follows Basic Memory coding conventions (literate style, Trigger/Why/Outcome) | |
| | Python 3.12+ type annotations | |
| | SQLAlchemy 2.0 / FastAPI / Typer conventions | N/A |
| | 100-character line length | |
| | No commented-out code | |
| Testing | Unit tests for new functions | |
| | Integration tests for MCP tools | N/A |
| | Edge cases covered (file gone, mtime+size change, save reset) | |
| | 100% test coverage maintained | |
| | test_tool_write_note.py cache injection now sets all three globals | ✅ (fixed in 6e7cfc7) |
| Architecture | Preserves local-first principles | |
| | No DB schema changes (no migrations needed) | |
| Security | No hardcoded secrets | |
| | Proper error handling (OSError caught) | |
| | No sensitive data in logs | |
| Compatibility | Windows-compatible os.stat().st_mtime and st_size | |
| | No emoji/unicode in console or log output | |

Verdict

Approve. All previously raised concerns are resolved. The implementation is correct, the regression is fixed, and all tests properly track the full cache state. The only remaining item is the _STALE_KEYS nit (module-level constant vs. local variable) which is low priority and can be addressed as a follow-up.

@groksrc groksrc merged commit db60942 into main Mar 11, 2026
25 checks passed
@groksrc groksrc deleted the fix/660-config-cache-staleness branch March 11, 2026 04:05
Development

Successfully merging this pull request may close these issues.

MCP stdio server does not pick up config changes from set-cloud/set-local
