perf(workspace): batch deps-cache invalidation into one workspace fs scan #10445

Open
davidfirst wants to merge 6 commits into master from component-loading-batch-deps-invalidation

@davidfirst davidfirst commented Jun 23, 2026

Member

Part of the component-loading redesign (scopes/workspace/workspace/component-loading-redesign.md, Phase 2).

The deps fs-cache freshness check ran a recursive globby per component that followed each component's node_modules symlink into the shared workspace node_modules — 226k of 230k scanned entries, run 313× per command. This replaces it with a command-scoped mtime index built from a few batched scans, memoized on FsCache and invalidated through the workspace's existing clear-cache hooks (so watch/start stay correct), with a per-component fallback.

Restricting the node_modules traversal — correctness matters here. The cached "dependencies-data" is the auto-detect result, which resolves imports through node_modules and reads each direct dep's package.json (name/componentId), so the cache genuinely depends on node_modules content — it can't simply be ignored. But the dependency-tree builder stops at the package boundary (it never traverses into a package or the transitive store), so the only node_modules signal the cache needs is each component's node_modules and node_modules/@scope directory mtimes — a dep added / removed / version-relinked / componentId-changed all go through a relink that bumps those dirs. The deep tree is irrelevant. So the scan is: component source (excluding the deep node_modules subtree) + those dir mtimes.
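
A minimal sketch of the restricted scan described above, assuming plain Node.js `fs` calls rather than the PR's globby-based helper. The function names (`mtimeOrZero`, `sourceLastModified`, `restrictedLastModified`) are hypothetical and only illustrate the idea: walk the component source, stop at `node_modules`, and add just the `node_modules` and `node_modules/@scope` directory mtimes.

```typescript
import { readdir, stat } from "node:fs/promises";
import { join } from "node:path";

/** mtime of a path in ms, or 0 when the path does not exist. */
async function mtimeOrZero(p: string): Promise<number> {
  try {
    return (await stat(p)).mtimeMs;
  } catch (err: any) {
    if (err?.code === "ENOENT") return 0;
    throw err;
  }
}

/** Max mtime over the tree under `dir`, never descending into node_modules. */
async function sourceLastModified(dir: string): Promise<number> {
  let latest = await mtimeOrZero(dir);
  for (const entry of await readdir(dir, { withFileTypes: true })) {
    if (entry.name === "node_modules") continue; // stop at the package boundary
    const full = join(dir, entry.name);
    const child = entry.isDirectory() ? await sourceLastModified(full) : await mtimeOrZero(full);
    latest = Math.max(latest, child);
  }
  return latest;
}

/** Hypothetical freshness signal: source mtimes plus node_modules and @scope dir mtimes only. */
export async function restrictedLastModified(componentDir: string): Promise<number> {
  const nodeModules = join(componentDir, "node_modules");
  let latest = Math.max(await sourceLastModified(componentDir), await mtimeOrZero(nodeModules));
  let names: string[] = [];
  try {
    names = await readdir(nodeModules);
  } catch (err: any) {
    if (err?.code !== "ENOENT") throw err; // a component without node_modules is fine
  }
  for (const name of names) {
    // only the scope directory itself is stat'ed; packages inside it are never opened
    if (name.startsWith("@")) latest = Math.max(latest, await mtimeOrZero(join(nodeModules, name)));
  }
  return latest;
}
```

In this sketch a file changed deep inside a package does not move the signal, while a relink that touches `node_modules` or an `@scope` directory does.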

Measured on this repo's workspace (~313 components): warm bit status filesystem syscalls 74.3k → 46.9k (~37% fewer); readFile traffic unchanged (checked against the bootstrap fs-read e2e metric). Correctness verified: source, node_modules, and @scope changes all invalidate the cache.

Warm wall-clock time is roughly flat on a fast local SSD (the eliminated work is I/O wait that overlaps with CPU on the single JS thread); the win lands on cold, CI, and networked filesystems. §4.1 of the redesign doc has the full breakdown (and corrects an earlier "object materialization" misread: deserialize takes 9ms).

…scan

The deps fs-cache freshness check ran a recursive globby per component that
followed each component's node_modules symlink into the shared workspace
node_modules (226k of 230k scanned entries), 313x per command.

Replace it with a single node_modules-ignoring workspace scan, memoized as a
command-scoped mtime index on FsCache and invalidated via the workspace's
clear-cache hooks (so watch stays correct), with a per-component fallback.
Cuts warm `bit status` fs syscalls ~40% (74.3k -> 44.8k); read traffic
unchanged. Warm-wall-neutral on fast SSD (I/O-wait overlapping CPU), a real
win on cold/CI/networked filesystems.

qodo-free-for-open-source-projects Bot commented Jun 23, 2026


Code Review by Qodo

🐞 Bugs (2) 📘 Rule violations (0) 📜 Skill insights (0)

Action required

1. Symlinked scopes not tracked 🐞 Bug ≡ Correctness ⭐ New
Description
getLastModifiedDirTimestampMs() uses globby() with { onlyDirectories: true, followSymbolicLinks:
false } for ${rootDir}/node_modules/@*, which can omit @scope entries that are symbolic links;
changes under those scopes won’t affect the freshness signal and can incorrectly reuse stale
deps-cache.
Code

scopes/toolbox/fs/last-modified/last-modified.ts[R27-30]

+  const scopeDirs = await globby(`${rootDir}/node_modules/@*`, {
+    onlyDirectories: true,
+    followSymbolicLinks: false,
+  });
Evidence
The new scan only adds mtimes for node_modules and the node_modules/@* scope directories it
discovers. However, the dependency linker code explicitly allows node_modules entries to be
directories or symlinks and recurses into @scope entries, implying @scope can be a symlink in
real installs; if globby omits such symlinked scope entries, scoped changes may not invalidate the
deps-cache.

scopes/toolbox/fs/last-modified/last-modified.ts[17-32]
scopes/dependencies/dependency-resolver/dependency-linker.ts[361-382]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`getLastModifiedDirTimestampMs()` builds `scopeDirs` via `globby(`${rootDir}/node_modules/@*`, { onlyDirectories: true, followSymbolicLinks: false })`. If an `@scope` folder in `node_modules` is represented as a symlink, it may be excluded from `scopeDirs`, so changes inside that scope (e.g., relinks under `@scope/`) won't bump the computed last-modified timestamp and the deps fs-cache can be treated as fresh incorrectly.

### Issue Context
Elsewhere in the repo, node_modules entries (including scoped ones) are explicitly treated as potentially being either directories or symlinks (and scoped entries are recursed into), so the scan should include symlinked `@scope` roots as well.

### Fix Focus Areas
- scopes/toolbox/fs/last-modified/last-modified.ts[17-32]

### Suggested fix
Replace the `globby(`${rootDir}/node_modules/@*`, ...)` call with a non-recursive directory listing of `${rootDir}/node_modules`:
- `readdir(node_modulesPath, { withFileTypes: true })`
- filter entries whose name starts with `@` and where `dirent.isDirectory()` **or** `dirent.isSymbolicLink()`
- add `${rootDir}/node_modules/<scopeName>` paths to `scopeDirs`

This avoids deep traversal while still capturing symlinked scope directories.
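
The suggested fix could look roughly like the sketch below. It assumes Node's `readdir` with `withFileTypes: true`; the names `pickScopeNames` and `listScopeDirs` and the `DirentLike` interface are hypothetical and do not appear in the PR.

```typescript
import { readdir } from "node:fs/promises";
import { join } from "node:path";

/** Minimal shape of fs.Dirent that the filter needs (hypothetical helper type). */
interface DirentLike {
  name: string;
  isDirectory(): boolean;
  isSymbolicLink(): boolean;
}

/** Keep `@scope` entries that are real directories or symlinks. */
export function pickScopeNames(entries: DirentLike[]): string[] {
  return entries
    .filter((e) => e.name.startsWith("@") && (e.isDirectory() || e.isSymbolicLink()))
    .map((e) => e.name);
}

/** Non-recursive listing of `${rootDir}/node_modules/@*`, including symlinked scopes. */
export async function listScopeDirs(rootDir: string): Promise<string[]> {
  const nodeModules = join(rootDir, "node_modules");
  try {
    const entries = await readdir(nodeModules, { withFileTypes: true });
    return pickScopeNames(entries).map((name) => join(nodeModules, name));
  } catch (err: any) {
    if (err?.code === "ENOENT") return []; // no node_modules yet
    throw err;
  }
}
```

Splitting the filter from the `readdir` call keeps the symlink rule testable without touching the filesystem.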



2. Stuck index build promise ✓ Resolved 🐞 Bug ☼ Reliability
Description
FsCache.getOrBuildComponentsMtimeIndex() only clears componentsMtimeIndexBuilding on success, so
if the build rejects once, the rejected promise is retained and all later callers keep failing
in the same process. This can break deps-cache reads for long-lived commands (e.g. watch/start)
until restart.
Code

scopes/workspace/modules/fs-cache/fs-cache.ts[R33-45]

+  async getOrBuildComponentsMtimeIndex(build: () => Promise<Map<string, number>>): Promise<Map<string, number>> {
+    if (this.componentsMtimeIndex) return this.componentsMtimeIndex;
+    if (!this.componentsMtimeIndexBuilding) {
+      const gen = this.componentsMtimeIndexGen;
+      this.componentsMtimeIndexBuilding = build().then((index) => {
+        // if the index was cleared while building, don't cache this now-stale result as canonical.
+        if (gen === this.componentsMtimeIndexGen) this.componentsMtimeIndex = index;
+        this.componentsMtimeIndexBuilding = undefined;
+        return index;
+      });
+    }
+    return this.componentsMtimeIndexBuilding;
+  }
Evidence
The memoized promise is only cleared in the success path, so a rejection leaves
componentsMtimeIndexBuilding set forever. The build path can reject because getPathStatIfExist
rethrows non-ENOENT errors and globby(..., { stats: true }) can also reject.

scopes/workspace/modules/fs-cache/fs-cache.ts[33-45]
scopes/toolbox/fs/last-modified/last-modified.ts[35-41]
scopes/toolbox/fs/last-modified/last-modified.ts[82-100]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`FsCache.getOrBuildComponentsMtimeIndex()` memoizes the first build in `componentsMtimeIndexBuilding`, but it only resets that field in the `.then()` success handler. If the build throws/rejects (e.g. a filesystem stat error), the rejected promise remains stored and future calls cannot recover.
### Issue Context
The build function used by deps invalidation (`buildDirsLastModifiedIndex`) performs a large glob + many stats, and can reject on non-ENOENT filesystem errors.
### Fix Focus Areas
- scopes/workspace/modules/fs-cache/fs-cache.ts[33-45]
### Implementation notes
- Ensure `componentsMtimeIndexBuilding` is cleared in a `finally` (or a `.catch()` that rethrows) so a transient build failure doesn’t permanently poison the cache.
- Consider also logging the error at debug/trace level to aid diagnosing glob/stat failures.
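
One way to read the first implementation note, as a self-contained sketch: the class `MtimeIndexCache` and its field names are hypothetical stand-ins for the `FsCache` members quoted above, and only the `finally` placement is the point.

```typescript
/** Hypothetical stand-in for the FsCache index memoization. */
export class MtimeIndexCache {
  private index?: Map<string, number>;
  private building?: Promise<Map<string, number>>;
  private gen = 0;

  getOrBuild(build: () => Promise<Map<string, number>>): Promise<Map<string, number>> {
    if (this.index) return Promise.resolve(this.index);
    if (this.building) return this.building;
    const gen = this.gen;
    const inFlight: Promise<Map<string, number>> = build()
      .then((index) => {
        // only cache the result if the index was not cleared while building
        if (gen === this.gen) this.index = index;
        return index;
      })
      .finally(() => {
        // runs on success and on rejection, so a failed build is never retained
        if (this.building === inFlight) this.building = undefined;
      });
    this.building = inFlight;
    return inFlight;
  }

  clear(): void {
    this.index = undefined;
    this.building = undefined;
    this.gen += 1;
  }
}
```

With this shape, a caller that hits a transient rejection still sees the error, but the next call starts a fresh build instead of receiving the stored rejected promise.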




Remediation recommended

3. Fallback scan wrong cwd 🐞 Bug ≡ Correctness
Description
DependenciesLoader.getComponentLastModified() builds the shared mtime index using workspace.path,
but its index-miss fallback calls getLastModifiedComponentTimestampMs(rootDir, ...) with a
workspace-relative rootDir while the underlying globby() directory scan is cwd-relative. If Bit is
invoked from a directory other than the workspace root, the fallback can compute an incorrect
last-modified time and incorrectly reuse stale deps cache, making invalidation depend on whether the
component hit the index or fell back.
Code

scopes/dependencies/dependencies/dependencies-loader/dependencies-loader.ts[R154-173]

+  private async getComponentLastModified(workspace: Workspace, rootDir: string): Promise<number> {
+    let index: Map<string, number> | undefined;
+    try {
+      index = await workspace.consumer.componentFsCache.getOrBuildComponentsMtimeIndex(() =>
+        buildComponentDirsLastModifiedIndex(
+          workspace.path,
+          workspace.consumer.bitMap.getAllComponents().map((componentMap) => componentMap.getComponentDir())
+        )
+      );
+    } catch (err: any) {
+      // a centralized scan failure (e.g. a filesystem error on one dir) shouldn't fail every
+      // component's load — fall back to the per-component scan below, preserving fault isolation.
+      this.logger.debug(`dependencies-loader, failed building the components mtime index: ${err?.message || err}`);
+    }
+    const fromIndex = index?.get(rootDir);
+    if (fromIndex !== undefined) return fromIndex;
+    const filesPaths = this.component.files.map((file) => file.path);
+    filesPaths.push(path.join(workspace.path, rootDir, COMPONENT_CONFIG_FILE_NAME));
+    const lastModified = await getLastModifiedComponentTimestampMs(rootDir, filesPaths);
+    index?.set(rootDir, lastModified);
Evidence
The index builder is explicitly rooted at workspace.path, but the fallback uses rootDir
directly; the underlying last-modified helper’s directory scan uses globby(rootDir) with no cwd,
meaning it is sensitive to process.cwd(). Elsewhere in deps resolution, the code explicitly avoids
relying on process.cwd() for path resolution, indicating non-root invocation is expected.

scopes/dependencies/dependencies/dependencies-loader/dependencies-loader.ts[154-174]
scopes/toolbox/fs/last-modified/last-modified.ts[12-20]
scopes/dependencies/dependencies/dependencies-loader/auto-detect-deps.ts[229-236]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`getComponentLastModified()` mixes two timestamp sources:
- index path: built with `workspace.path` as the scan base
- fallback path: calls `getLastModifiedComponentTimestampMs(rootDir, ...)` where `rootDir` is workspace-relative
`getLastModifiedComponentTimestampMs()` ultimately glob-scans directories via `globby(rootDir)` with no `cwd`, so `rootDir` is interpreted relative to `process.cwd()`. This can make the fallback return a wrong timestamp when Bit runs from a subdirectory, causing stale deps-cache reuse.
## Issue Context
This becomes a correctness/reliability issue specifically on index misses (e.g. after single-component cache invalidation during long-lived flows), because the index path and fallback path no longer agree on what `rootDir` is relative to.
## Fix Focus Areas
- scopes/dependencies/dependencies/dependencies-loader/dependencies-loader.ts[154-173]
### Suggested change
In the fallback path, resolve `rootDir` to an absolute path (or pass an explicit cwd) before calling `getLastModifiedComponentTimestampMs`.
Example approach:
- compute `const absRootDir = path.join(workspace.path, rootDir);`
- call `getLastModifiedComponentTimestampMs(absRootDir, filesPaths)`
- keep the index key as the original `rootDir` (relative) when doing `index?.set(rootDir, lastModified)`
(Alternative: update the toolbox last-modified helper to accept a `cwd` option and use it in `globby()`.)
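
The example approach can be sketched as a small helper. `toAbsoluteRootDir` is a hypothetical name; the only behavior it illustrates is anchoring a workspace-relative `rootDir` to the workspace path instead of letting it resolve against `process.cwd()`.

```typescript
import * as path from "node:path";

/**
 * Hypothetical helper: resolve a workspace-relative component dir against the
 * workspace root so the scan does not depend on where the command was invoked.
 */
export function toAbsoluteRootDir(workspacePath: string, rootDir: string): string {
  return path.isAbsolute(rootDir) ? rootDir : path.join(workspacePath, rootDir);
}

/** What a cwd-relative scan would effectively look at for the same input. */
export function cwdRelative(rootDir: string): string {
  return path.resolve(rootDir); // resolves against process.cwd()
}
```

The two functions agree only when the process runs from the workspace root, which is the mismatch the finding describes.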



4. Fallback rescans node_modules ✓ Resolved 🐞 Bug ➹ Performance
Description
DependenciesLoader.getComponentLastModified() falls back to getLastModifiedComponentTimestampMs()
when an index entry is missing, but that per-component scan does not ignore node_modules and can
again traverse the component’s node_modules symlink tree. This means watch/single-component
invalidations (or any index-miss) can still pay the old worst-case traversal cost and behave
differently than the shared index path.
Code

scopes/dependencies/dependencies/dependencies-loader/dependencies-loader.ts[R154-174]

+  private async getComponentLastModified(workspace: Workspace, rootDir: string): Promise<number> {
+    let index: Map<string, number> | undefined;
+    try {
+      index = await workspace.consumer.componentFsCache.getOrBuildComponentsMtimeIndex(() =>
+        buildDirsLastModifiedIndex(
+          workspace.path,
+          workspace.consumer.bitMap.getAllComponents().map((componentMap) => componentMap.getComponentDir())
+        )
+      );
+    } catch (err: any) {
+      // a centralized scan failure (e.g. a filesystem error on one dir) shouldn't fail every
+      // component's load — fall back to the per-component scan below, preserving fault isolation.
+      this.logger.debug(`dependencies-loader, failed building the components mtime index: ${err?.message || err}`);
+    }
+    const fromIndex = index?.get(rootDir);
+    if (fromIndex !== undefined) return fromIndex;
+    const filesPaths = this.component.files.map((file) => file.path);
+    filesPaths.push(path.join(workspace.path, rootDir, COMPONENT_CONFIG_FILE_NAME));
+    const lastModified = await getLastModifiedComponentTimestampMs(rootDir, filesPaths);
+    index?.set(rootDir, lastModified);
+    return lastModified;
Evidence
The shared index path explicitly ignores node_modules, but the fallback calls the legacy
per-component timestamp function, whose internal directory scan has the node_modules ignore
commented out; therefore index misses can still traverse node_modules and pay the old cost.

scopes/dependencies/dependencies/dependencies-loader/dependencies-loader.ts[154-175]
scopes/toolbox/fs/last-modified/last-modified.ts[12-19]
scopes/toolbox/fs/last-modified/last-modified.ts[70-89]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`getComponentLastModified()` uses `buildDirsLastModifiedIndex()` (which ignores `node_modules`) for the shared index, but when the entry is missing it falls back to `getLastModifiedComponentTimestampMs()`, whose directory scan currently does **not** ignore `node_modules`. This reintroduces the expensive traversal on fallback and makes behavior inconsistent between index-hit and index-miss cases.
### Issue Context
The intent of the PR is to avoid following component `node_modules` symlinks into the workspace `node_modules` during deps-cache freshness checks.
### Fix Focus Areas
- scopes/dependencies/dependencies/dependencies-loader/dependencies-loader.ts[154-174]
- scopes/toolbox/fs/last-modified/last-modified.ts[12-19]
- scopes/toolbox/fs/last-modified/last-modified.ts[70-89]
### Suggested fix
Update the fallback path to use the same `node_modules`-ignoring logic as the shared index. Options:
1) In `getComponentLastModified()`, when `fromIndex` is missing, compute last-modified via `buildDirsLastModifiedIndex(workspace.path, [rootDir])` (same ignore defaults) and use that value; optionally also stat the component config explicitly if needed.
2) Alternatively, extend `getLastModifiedComponentTimestampMs()` / `getLastModifiedDirTimestampMs()` to accept an `ignore` list (defaulting to current behavior), and call it from the fallback with `['**/node_modules/**']` to match the batched scan semantics.



5. Inflight build invalidation race ✓ Resolved 🐞 Bug ≡ Correctness
Description
deleteComponentMtimeIndexEntry() only deletes from an already-built map and does not invalidate an
in-flight index build, so a build started before a watch-triggered cache clear can still be cached
as canonical afterward. This can cause deps-cache staleness checks to miss a just-changed component
and incorrectly reuse cached dependencies.
Code

scopes/workspace/modules/fs-cache/fs-cache.ts[R33-57]

+  async getOrBuildComponentsMtimeIndex(build: () => Promise<Map<string, number>>): Promise<Map<string, number>> {
+    if (this.componentsMtimeIndex) return this.componentsMtimeIndex;
+    if (!this.componentsMtimeIndexBuilding) {
+      const gen = this.componentsMtimeIndexGen;
+      this.componentsMtimeIndexBuilding = build().then((index) => {
+        // if the index was cleared while building, don't cache this now-stale result as canonical.
+        if (gen === this.componentsMtimeIndexGen) this.componentsMtimeIndex = index;
+        this.componentsMtimeIndexBuilding = undefined;
+        return index;
+      });
+    }
+    return this.componentsMtimeIndexBuilding;
+  }
+
+  /** drop the whole index (e.g. on a full workspace cache clear). */
+  clearComponentsMtimeIndex() {
+    this.componentsMtimeIndex = undefined;
+    this.componentsMtimeIndexBuilding = undefined;
+    this.componentsMtimeIndexGen += 1;
+  }
+
+  /** drop a single component's entry so its next load recomputes it (e.g. on a watch file change). */
+  deleteComponentMtimeIndexEntry(rootDir: string) {
+    this.componentsMtimeIndex?.delete(rootDir);
+  }
Evidence
Watch triggers workspace.clearComponentCache(), which deletes an index entry, but the delete
method only affects the already-built map. The build caching decision is based on the generation
captured at build start and delete does not bump it, so an in-flight build can still be accepted as
canonical.

scopes/workspace/watcher/watcher.ts[651-664]
scopes/workspace/workspace/workspace.ts[840-846]
scopes/workspace/modules/fs-cache/fs-cache.ts[33-45]
scopes/workspace/modules/fs-cache/fs-cache.ts[54-57]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`deleteComponentMtimeIndexEntry()` is intended to force recomputation for a component after `workspace.clearComponentCache()`, but it’s a no-op if the shared index hasn’t been materialized yet (or is still building). Because the in-flight build is still eligible to be cached (generation unchanged), it can reintroduce the stale entry.
### Issue Context
Watch flows call `workspace.clearComponentCache()` on file changes, which calls `deleteComponentMtimeIndexEntry(componentDir)`.
### Fix Focus Areas
- scopes/workspace/modules/fs-cache/fs-cache.ts[33-57]
### Implementation notes
Pick one:
- On `deleteComponentMtimeIndexEntry()`, if `componentsMtimeIndexBuilding` is set, bump `componentsMtimeIndexGen` (and/or call `clearComponentsMtimeIndex()`) so the in-flight build result will not be cached as canonical.
- Alternatively track pending deletions and apply them to the resolved index before caching/returning it.
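
A sketch of the first option, under the assumption that bumping the generation during an in-flight build is enough to keep its result from being cached. The class `GenerationGuardedIndex` is hypothetical and mirrors the quoted `FsCache` methods only loosely.

```typescript
/** Hypothetical index cache where a per-entry delete also invalidates an in-flight build. */
export class GenerationGuardedIndex {
  private index?: Map<string, number>;
  private building?: Promise<Map<string, number>>;
  private gen = 0;

  getOrBuild(build: () => Promise<Map<string, number>>): Promise<Map<string, number>> {
    if (this.index) return Promise.resolve(this.index);
    if (this.building) return this.building;
    const gen = this.gen;
    const inFlight: Promise<Map<string, number>> = build()
      .then((index) => {
        // a bumped generation means something was invalidated mid-build
        if (gen === this.gen) this.index = index;
        return index;
      })
      .finally(() => {
        if (this.building === inFlight) this.building = undefined;
      });
    this.building = inFlight;
    return inFlight;
  }

  deleteEntry(rootDir: string): void {
    this.index?.delete(rootDir);
    // if a build is running, its snapshot may predate the change: do not let it become canonical
    if (this.building) this.gen += 1;
  }
}
```

Callers already awaiting the in-flight build still receive its result in this sketch; only the caching of that result is suppressed, so the next load rebuilds.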

PR Summary by Qodo

perf(workspace): batch deps-cache invalidation with shared workspace mtime index
✨ Enhancement 📝 Documentation 🕐 40+ Minutes

Description

• Replace per-component recursive mtime scans with a single workspace scan shared across components.
• Memoize the workspace-wide mtime index in FsCache and invalidate it via existing clear-cache
 hooks.
• Update the component-loading redesign doc with corrected profiling conclusions and new perf
 numbers.
Diagram

graph TD
DL["DependenciesLoader"] --> FC["FsCache mtime index"] --> BLD["buildDirsLastModifiedIndex"] --> FS[("Workspace FS")]
DL --> FB["Per-component fallback"] --> FS
WS["Workspace clear-cache"] --> FC
High-Level Assessment

The following are alternative approaches to this PR:

1. Persistent mtime index across commands
  • ➕ Eliminates the workspace scan on every command; best-case warm performance on very large workspaces
  • ➕ Can be incrementally updated on watch events rather than rebuilt
  • ➖ More correctness edge cases (stale index on crashes, external file changes, git operations)
  • ➖ Requires durable storage format/versioning and robust invalidation rules
2. Custom filesystem walker instead of globby(stats: true)
  • ➕ Potentially fewer allocations and lower overhead than globby’s generalized matcher
  • ➕ More control over symlink handling and ignore rules
  • ➖ Higher maintenance surface (OS quirks, symlinks, dotfiles)
  • ➖ Reimplements behavior globby already provides; risk of subtle traversal bugs

Recommendation: The PR’s approach (single scan per command + memoized index with clear-cache invalidation and per-component fallback) is the best near-term tradeoff: it removes the pathological N× scan behavior while keeping correctness localized and leveraging existing cache-clear pathways (watch/start). A persistent cross-command index could yield further gains but adds significant correctness and maintenance risk.

Files changed (6) +184 / -28

Enhancement (5) +130 / -4
dependencies-loader.ts: Use workspace-wide mtime index for deps-cache staleness checks (+25/-4)

• Replaces the per-component recursive last-modified calculation with a helper that consults a shared workspace mtime index. Falls back to the previous per-component scan when the index lacks an entry (e.g., after single-component cache clears) and memoizes the computed fallback result back into the index.

scopes/dependencies/dependencies/dependencies-loader/dependencies-loader.ts

index.ts: Export buildDirsLastModifiedIndex from fs last-modified toolbox (+1/-0)

• Adds the new multi-directory last-modified index builder to the package’s public exports so workspace code can reuse it.

scopes/toolbox/fs/last-modified/index.ts

last-modified.ts: Add single-scan multi-directory last-modified index builder (+64/-0)

• Introduces buildDirsLastModifiedIndex(), which computes max mtime per component directory via a single globby scan (including nested dirs) while ignoring node_modules by default. Adds ownerDir() to associate scanned entries with the deepest matching input dir and explicitly stats the root dirs to capture deletions directly under them.
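
The "deepest matching input dir" rule can be illustrated with a pure function. This is a sketch under assumptions: the signature of `ownerDir` and the helper `maxMtimePerDir` are guesses for illustration, paths are assumed to use `/` separators, and the real implementation in last-modified.ts may differ.

```typescript
/** Return the longest input dir that contains `entryPath`, or undefined if none does. */
export function ownerDir(entryPath: string, dirs: string[]): string | undefined {
  let best: string | undefined;
  for (const dir of dirs) {
    // require a separator after the dir so "components/ui" does not claim "components/ui-kit"
    const isInside = entryPath === dir || entryPath.startsWith(`${dir}/`);
    if (isInside && (best === undefined || dir.length > best.length)) best = dir;
  }
  return best;
}

/** Fold scanned entries into a max-mtime-per-dir index (hypothetical helper). */
export function maxMtimePerDir(
  entries: Array<{ path: string; mtimeMs: number }>,
  dirs: string[]
): Map<string, number> {
  const index = new Map<string, number>();
  for (const dir of dirs) index.set(dir, 0);
  for (const entry of entries) {
    const owner = ownerDir(entry.path, dirs);
    if (owner === undefined) continue;
    index.set(owner, Math.max(index.get(owner) ?? 0, entry.mtimeMs));
  }
  return index;
}
```

Attributing each entry to the deepest dir keeps a nested component's changes from bumping the timestamp of the component that contains it.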

scopes/toolbox/fs/last-modified/last-modified.ts

fs-cache.ts: Memoize command-scoped component mtime index with safe invalidation (+37/-0)

• Adds a cached componentsMtimeIndex plus a shared in-flight build promise to prevent duplicate concurrent builds. Implements generation-based protection so an index cleared mid-build doesn’t get re-cached, and provides APIs to clear the whole index or delete a single component entry.

scopes/workspace/modules/fs-cache/fs-cache.ts

workspace.ts: Invalidate shared mtime index via existing workspace cache clear hooks (+3/-0)

• Extends clearAllComponentsCache() to drop the entire mtime index and clearComponentCache() to delete the specific component directory entry. This keeps watch/start correctness by ensuring future loads recompute mtimes after cache invalidation events.

scopes/workspace/workspace/workspace.ts

Documentation (1) +54 / -24
component-loading-redesign.md: Update redesign doc with corrected profiling + batched invalidation results (+54/-24)

• Marks the batched deps-cache invalidation scan as shipped in Phase 2 and updates the profiling narrative to reflect that the hotspot was filesystem traversal (not dependency object materialization). Adds the measured syscall/statFiles reductions and clarifies wall-time vs aggregate self-time interpretation.

scopes/workspace/workspace/component-loading-redesign.md

…nd watch races

Address qodo review on the deps-cache invalidation index:
- clear the in-flight build promise in `finally`, so a transient build
  rejection (glob/stat error) no longer poisons all later reads in a
  long-lived process (watch/start).
- in `deleteComponentMtimeIndexEntry`, bump the generation when a build is
  in-flight, so a watch-triggered clear that races the first build discards
  that build's result instead of caching the now-stale entry as canonical.
- fall back to the per-component scan if the centralized index build throws,
  preserving fault isolation (one bad dir no longer fails every load).

Code review by qodo was updated up to the latest commit db242dc

…eeds

The first cut ignored node_modules entirely, which is a correctness regression:
auto-detect resolves imports *through* node_modules and reads each direct dep's
package.json (name/componentId), so the cached result depends on node_modules
content — ignoring it drops the invalidation safety net.

Restrict instead of ignore: scan component source (excluding the deep
node_modules subtree) plus each component's `node_modules` and
`node_modules/@scope` directory mtimes. A dep add/remove/version-relink/
componentId-change all go through a relink that bumps those dirs, and the
dependency-tree builder stops at the package boundary, so the deep tree and
transitive store are irrelevant to the cache.

Warm `bit status` fs syscalls 74.3k -> 46.9k (~37%); correctness preserved
(verified: source, node_modules, and @scope changes all invalidate).
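
A rough sketch of the restricted scan described in this commit message, using only Node's built-in `fs` API rather than `globby`. The function names (`newestSourceMtimeMs`, `dirMtimeMs`, `freshnessSignalMs`) are illustrative assumptions and this is not the PR's implementation.

```typescript
import { promises as fs } from "node:fs";
import * as path from "node:path";

// Newest mtime under `dir`, never descending into a node_modules directory.
async function newestSourceMtimeMs(dir: string): Promise<number> {
  let newest = (await fs.stat(dir)).mtimeMs;
  for (const entry of await fs.readdir(dir, { withFileTypes: true })) {
    if (entry.name === "node_modules") continue;
    const full = path.join(dir, entry.name);
    const mtime = entry.isDirectory()
      ? await newestSourceMtimeMs(full)
      : (await fs.lstat(full)).mtimeMs;
    newest = Math.max(newest, mtime);
  }
  return newest;
}

// mtime of a directory, or 0 when it does not exist.
async function dirMtimeMs(dir: string): Promise<number> {
  try {
    return (await fs.stat(dir)).mtimeMs;
  } catch {
    return 0;
  }
}

// Source files plus only the node_modules and node_modules/@scope dir mtimes.
async function freshnessSignalMs(componentDir: string): Promise<number> {
  const nodeModules = path.join(componentDir, "node_modules");
  let scopes: string[] = [];
  try {
    scopes = (await fs.readdir(nodeModules)).filter((name) => name.startsWith("@"));
  } catch {
    // No node_modules directory: only the source files contribute.
  }
  const mtimes = await Promise.all([
    newestSourceMtimeMs(componentDir),
    dirMtimeMs(nodeModules),
    ...scopes.map((scope) => dirMtimeMs(path.join(nodeModules, scope))),
  ]);
  return Math.max(...mtimes);
}
```

A change deep inside `node_modules/@scope/pkg` does not move this signal, while adding or removing an entry directly under `node_modules` or `node_modules/@scope` does, because that changes the directory's own mtime.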
@qodo-free-for-open-source-projects

Code review by qodo was updated up to the latest commit 3b9a34f

…mponent scan instead

The shared command-scoped index added memoization/invalidation complexity (and
the two reliability bugs qodo flagged) for no syscall benefit — batching saves
nothing here, and it over-scanned single-component commands (bit show would scan
all components to check one). Reverted it (fs-cache, workspace, dependencies-loader
back to their original state; deps fs-cache and per-component invalidation untouched).

Keep only the actual win: the per-component freshness scan
(getLastModifiedDirTimestampMs) now stops at the node_modules boundary and takes
just the node_modules + node_modules/@scope dir mtimes — auto-detect never
traverses into a package, so the deep tree/transitive store are irrelevant.

Warm `bit status` fs syscalls 74.3k -> 46.4k (~37%); correctness verified
(source / node_modules / @scope changes all invalidate).
@davidfirst

Member Author

Simplified the approach (commit 7e1857b): dropped the shared command-scoped mtime index entirely and instead just restrict the existing per-component freshness scan to the node_modules footprint the cache actually needs.

Why: the shared index added memoization/invalidation complexity (including the two reliability issues flagged above) for no syscall benefit — batching saves nothing here (measured: per-component 46.4k vs batched 46.9k), and it over-scanned single-component commands (bit show comp1 would scan all components to check one). The deps fs-cache and its per-component invalidation are now untouched (fs-cache.ts / workspace.ts / dependencies-loader.ts back to their original state) — so the earlier index-related concerns no longer apply.

Net change is now ~1 file: `getLastModifiedDirTimestampMs` stops at the node_modules boundary and takes only the `node_modules` and `node_modules/@scope` dir mtimes (auto-detect never traverses into a package). Warm `bit status` fs syscalls 74.3k → 46.4k (~37%); correctness verified (source / node_modules / @scope changes all invalidate).

@qodo-free-for-open-source-projects

Code review by qodo was updated up to the latest commit 7e1857b

Comment on lines +27 to +30
const scopeDirs = await globby(`${rootDir}/node_modules/@*`, {
  onlyDirectories: true,
  followSymbolicLinks: false,
});

Action required

1. Symlinked scopes not tracked 🐞 Bug ≡ Correctness

getLastModifiedDirTimestampMs() uses globby() with { onlyDirectories: true, followSymbolicLinks:
false } for ${rootDir}/node_modules/@*, which can omit @scope entries that are symbolic links;
changes under those scopes won’t affect the freshness signal and can incorrectly reuse stale
deps-cache.
Agent Prompt
### Issue description
`getLastModifiedDirTimestampMs()` builds `scopeDirs` via ``globby(`${rootDir}/node_modules/@*`, { onlyDirectories: true, followSymbolicLinks: false })``. If an `@scope` folder in `node_modules` is represented as a symlink, it may be excluded from `scopeDirs`, so changes inside that scope (e.g., relinks under `@scope/`) won't bump the computed last-modified timestamp and the deps fs-cache can be treated as fresh incorrectly.

### Issue Context
Elsewhere in the repo, node_modules entries (including scoped ones) are explicitly treated as potentially being either directories or symlinks (and scoped entries are recursed into), so the scan should include symlinked `@scope` roots as well.

### Fix Focus Areas
- scopes/toolbox/fs/last-modified/last-modified.ts[17-32]

### Suggested fix
Replace the ``globby(`${rootDir}/node_modules/@*`, ...)`` call with a non-recursive directory listing of `${rootDir}/node_modules`:
- `readdir(node_modulesPath, { withFileTypes: true })`
- filter entries whose name starts with `@` and where `dirent.isDirectory()` **or** `dirent.isSymbolicLink()`
- add `${rootDir}/node_modules/<scopeName>` paths to `scopeDirs`

This avoids deep traversal while still capturing symlinked scope directories.
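
The suggested fix might look roughly like the following. It is a sketch only: `listScopeDirs` is a hypothetical helper name, and treating a missing `node_modules` as "no scopes" is an assumption the review does not spell out.

```typescript
import { promises as fs } from "node:fs";
import * as path from "node:path";

// Lists node_modules/@scope entries that are real directories or symlinks,
// using a single non-recursive readdir (no deep traversal).
async function listScopeDirs(rootDir: string): Promise<string[]> {
  const nodeModules = path.join(rootDir, "node_modules");
  const entries = await fs.readdir(nodeModules, { withFileTypes: true }).catch((err) => {
    // Assumption: a component without node_modules simply has no scope dirs.
    if (err.code === "ENOENT") return null;
    throw err;
  });
  if (entries === null) return [];
  return entries
    .filter((e) => e.name.startsWith("@") && (e.isDirectory() || e.isSymbolicLink()))
    .map((e) => path.join(nodeModules, e.name));
}
```

`Dirent.isDirectory()` is false for a symlink even when the link points at a directory, which is why the filter checks `isSymbolicLink()` as well.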

@qodo-free-for-open-source-projects

Code review by qodo was updated up to the latest commit 5bae110
