
feat: add ignoreAdditionalDirs config to let repos extend IGNORE_DIRS #1666

Merged
carlos-alm merged 8 commits into main from feat/issue-1649
Jun 21, 2026

Conversation

@carlos-alm

Contributor

Summary

  • Adds ignoreAdditionalDirs: string[] to .codegraphrc.json (and CodegraphConfig / DEFAULTS) — an array of additional directory names that are merged with the built-in IGNORE_DIRS set at file-collection time
  • Adds buildIgnoreSet(additionalDirs?) helper to shared/constants.ts that produces a merged ignore set without mutating the global IGNORE_DIRS
  • Removes crates from the global IGNORE_DIRS default (it was baking a repo-specific Rust workspace carve-out into every analysis); adds "ignoreAdditionalDirs": ["crates"] to this repo's own .codegraphrc.json instead
  • Includes ignoreAdditionalDirs in BUILD_HASH_KEYS so changing the value triggers a full rebuild
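
The rebuild trigger in the last bullet can be pictured with a small sketch. It is not the repo's code: the `Config` shape, the `buildFingerprint` helper, and the use of a JSON string instead of a real hash are all assumptions made for illustration. The idea is only that a fingerprint computed over the build-relevant keys changes when `ignoreAdditionalDirs` changes, and stays the same when an unrelated key changes.

```typescript
// Hypothetical sketch: the names and shapes below are assumptions, not the repo's code.
type Config = { ignoreDirs: string[]; ignoreAdditionalDirs: string[]; verbose: boolean };

// Keys whose values affect which files end up in the graph.
const BUILD_HASH_KEYS: ReadonlyArray<keyof Config> = ["ignoreDirs", "ignoreAdditionalDirs"];

// Serialize only the build-relevant keys, in a fixed order, so the result is stable.
// A real implementation would presumably hash this string; that step is omitted here.
function buildFingerprint(config: Config): string {
  const picked = BUILD_HASH_KEYS.map((key) => [key, config[key]]);
  return JSON.stringify(picked);
}

const before: Config = { ignoreDirs: [], ignoreAdditionalDirs: [], verbose: false };
const after: Config = { ...before, ignoreAdditionalDirs: ["crates"] };
const noisy: Config = { ...before, verbose: true };

// A changed fingerprint signals that incremental state is stale.
const needsFullRebuild = buildFingerprint(before) !== buildFingerprint(after); // true
const verboseOnly = buildFingerprint(before) !== buildFingerprint(noisy); // false
```

Keeping the key list explicit means a new config option only invalidates existing builds once someone adds it to the list, which is what this PR does for `ignoreAdditionalDirs`.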

Test plan

  • npx vitest run tests/unit/constants.test.ts — new tests verify crates is not in IGNORE_DIRS, buildIgnoreSet merges correctly, and immutability is preserved
  • npx vitest run tests/unit/builder.test.ts — new tests verify collectFiles respects ignoreAdditionalDirs alone and combined with ignoreDirs
  • npm run lint — clean (no warnings)
  • Full unit suite npx vitest run tests/unit/ — 1137 tests pass

Closes #1649

Impact: 4 functions changed, 8 affected
…1649

Impact: 1 functions changed, 5 affected
…1649

Impact: 3 functions changed, 8 affected
Add an `ignoreAdditionalDirs` key to `.codegraphrc.json` (array of strings)
that is merged with the global IGNORE_DIRS set at file-collection time. This
lets each repo declare its own carve-outs without baking them into the
hardcoded global default.

- Add `buildIgnoreSet(additionalDirs?)` helper to `shared/constants.ts` that
  merges IGNORE_DIRS with any extra dirs without mutating the original set
- Add `ignoreAdditionalDirs: string[]` to `CodegraphConfig` in `types.ts` and
  `DEFAULTS` in `infrastructure/config.ts`; include it in `BUILD_HASH_KEYS`
  so config changes trigger a full rebuild
- Update `collectFiles` in `builder/helpers.ts` to merge both `ignoreDirs`
  and `ignoreAdditionalDirs` into the walk's ignore set via `buildIgnoreSet`
- Remove `crates` from the global IGNORE_DIRS default — it was added to
  handle NAPI-RS artifacts in this repo's Rust workspace, but silently
  excluded `crates/` in every other codebase; add
  `"ignoreAdditionalDirs": ["crates"]` to this repo's `.codegraphrc.json`
  instead

Closes #1649

Impact: 6 functions changed, 2 affected
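
As a rough illustration of the `collectFiles` change described in this commit, the sketch below walks an in-memory tree instead of the file system. The `DirNode` type, the `collect` function, and the contents of `BUILT_IN_IGNORES` are invented for the example; only the idea of merging `ignoreDirs` and `ignoreAdditionalDirs` with the built-in set and pruning matching directory names comes from the PR.

```typescript
// Illustrative only: the tree shape and function names are assumptions.
type DirNode = { name: string; dirs?: DirNode[]; files?: string[] };

// Stand-in for the built-in IGNORE_DIRS; the real list is longer.
const BUILT_IN_IGNORES: ReadonlySet<string> = new Set(["node_modules", ".git"]);

// Collect file paths, pruning any directory whose name is in the ignore set.
function collect(node: DirNode, ignored: ReadonlySet<string>, prefix = ""): string[] {
  const here = prefix ? `${prefix}/${node.name}` : node.name;
  const out = (node.files ?? []).map((file) => `${here}/${file}`);
  for (const dir of node.dirs ?? []) {
    if (ignored.has(dir.name)) continue; // pruned: never descended into
    out.push(...collect(dir, ignored, here));
  }
  return out;
}

const repo: DirNode = {
  name: "repo",
  files: ["index.ts"],
  dirs: [
    { name: "src", files: ["a.ts"] },
    { name: "crates", files: ["lib.rs"] },
    { name: "node_modules", files: ["dep.js"] },
  ],
};

// Both config keys are merged with the built-in set into a fresh Set.
const repoConfig = { ignoreDirs: ["dist"], ignoreAdditionalDirs: ["crates"] };
const mergedIgnores: ReadonlySet<string> = new Set([
  ...BUILT_IN_IGNORES,
  ...repoConfig.ignoreDirs,
  ...repoConfig.ignoreAdditionalDirs,
]);

const collected = collect(repo, mergedIgnores);
// collected: ["repo/index.ts", "repo/src/a.ts"]
```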
@greptile-apps

greptile-apps Bot commented Jun 21, 2026

Contributor

Greptile Summary

This PR introduces ignoreAdditionalDirs: string[] as a per-repo config option for extending the built-in IGNORE_DIRS set, and uses it to move the crates entry out of the global default and into this repo's own .codegraphrc.json. The JS/WASM path is handled cleanly via a new buildIgnoreSet() helper; the watcher is also updated to load config and respect the merged ignore set.

  • buildIgnoreSet helper (src/shared/constants.ts): merges IGNORE_DIRS with ignoreDirs and ignoreAdditionalDirs without mutating the global, with ReadonlySet<string> used throughout to prevent accidental mutation.
  • Watcher alignment (src/domain/graph/watcher.ts): watch mode now loads repo config at startup and applies the same merged ignore set used by the batch build path.
  • Native Rust engine gap (crates/codegraph-core/src/infrastructure/config.rs / pipeline.rs): BuildConfig has no ignore_additional_dirs field, so ignoreAdditionalDirs values from the JSON config are silently dropped when the native engine runs. The crates case is covered in this repo by the pre-existing exclude: ["crates/**"] glob, but users who rely solely on ignoreAdditionalDirs without a matching exclude will see the native engine process those directories.

Confidence Score: 4/5

Safe to merge for this repo given the existing exclude: ["crates/**"] glob provides a fallback for the native engine, but the new config key does not work end-to-end on the native path for general users.

The Rust BuildConfig was not updated to include ignore_additional_dirs, so ignoreAdditionalDirs values are silently discarded by the native engine. For this specific repo the exclude glob provides a workaround, but any other user relying on ignoreAdditionalDirs alone without a matching exclude will find those directories processed by the native engine — the feature is only half-implemented.

crates/codegraph-core/src/infrastructure/config.rs and crates/codegraph-core/src/domain/graph/builder/pipeline.rs need an ignore_additional_dirs field and merge logic to match what the JS side does.
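
Until the native path reads the new key, a repo could check whether each `ignoreAdditionalDirs` entry also has an `exclude` fallback. The helper below is hypothetical and not part of this PR; the `RepoConfig` shape is assumed from the discussion above, and the check only recognizes an exact `<dir>/**` glob string rather than evaluating glob semantics.

```typescript
// Hypothetical helper; not part of the PR. Config shape is assumed from the discussion.
type RepoConfig = { exclude: string[]; ignoreAdditionalDirs: string[] };

// Return the ignoreAdditionalDirs entries that have no literal `<dir>/**` glob in
// `exclude`, i.e. the ones the native engine would still process per the review.
function dirsWithoutExcludeFallback(config: RepoConfig): string[] {
  const globs = new Set(config.exclude);
  return config.ignoreAdditionalDirs.filter((dir) => !globs.has(`${dir}/**`));
}

const thisRepo: RepoConfig = { exclude: ["crates/**"], ignoreAdditionalDirs: ["crates"] };
const otherRepo: RepoConfig = { exclude: [], ignoreAdditionalDirs: ["vendor", "generated"] };

const uncoveredHere = dirsWithoutExcludeFallback(thisRepo); // []
const uncoveredThere = dirsWithoutExcludeFallback(otherRepo); // ["vendor", "generated"]
```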

Important Files Changed

| Filename | Overview |
|---|---|
| crates/codegraph-core/src/infrastructure/config.rs | Rust BuildConfig lacks ignore_additional_dirs field; ignoreAdditionalDirs values are silently dropped by the native engine, diverging from the JS/WASM path. |
| src/shared/constants.ts | Adds buildIgnoreSet() helper that returns a new merged Set without mutating IGNORE_DIRS; crates removed from default list. |
| src/domain/graph/builder/helpers.ts | shouldSkipEntry and CollectContext updated to use ReadonlySet ignoreSet merging IGNORE_DIRS, ignoreDirs, and ignoreAdditionalDirs. |
| src/domain/graph/watcher.ts | Watcher now loads repo config and builds a merged ignoreSet so watch mode respects the same ignore rules as the batch build path. |
| src/infrastructure/config.ts | ignoreAdditionalDirs added to DEFAULTS and BUILD_HASH_KEYS; config change will correctly trigger a full rebuild. |
| src/types.ts | ignoreAdditionalDirs: string[] added to CodegraphConfig interface with a JSDoc comment. |
| crates/codegraph-core/src/domain/graph/builder/stages/collect_files.rs | crates removed from DEFAULT_IGNORE_DIRS; the gap is partially covered by exclude: ["crates/**"] in .codegraphrc.json, but the directory is still traversed (just files excluded at glob level). |
| .codegraphrc.json | Adds ignoreAdditionalDirs: ["crates"] to repo config; existing exclude: ["crates/**"] remains as the safety net for the native engine. |
| src/domain/graph/builder/stages/native-orchestrator.ts | Comment updated to reflect the finally-block FK restore added in the previous review cycle; no functional change to this file. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    RC[".codegraphrc.json\nignoreAdditionalDirs: ['crates']"] --> LC[loadConfig]
    LC --> BGS["buildIgnoreSet(\n  ignoreDirs + ignoreAdditionalDirs\n)"]
    BGS --> IGN["ignoreSet\n(ReadonlySet)"]

    IGN --> CF["collectFiles (JS/WASM)\nshouldSkipEntry checks ignoreSet"]
    IGN --> WA["watcher.ts\ncollectTrackedFiles / shouldIgnorePath"]

    subgraph "Native Engine path"
        LC2["ctx.config (JSON)"] --> RBC["Rust BuildConfig\nonly reads ignoreDirs"]
        RBC --> PL["pipeline.rs\ncollect_files(&config.ignore_dirs, ...)"]
        PL --> MISS["ignoreAdditionalDirs\n⚠️ SILENTLY DROPPED"]
        MISS --> EXCL["Fallback: exclude glob\n'crates/**' filters files\nbut dir is still traversed"]
    end

    LC --> LC2

Reviews (6): Last reviewed commit: "fix: resolve merge conflicts with main" | Re-trigger Greptile

Comment on lines +2110 to +2114
try {
  ctx.nativeDb.exec('PRAGMA foreign_keys = OFF');
} catch {
  // exec may not exist on very old addon versions — safe to ignore
}

Contributor

P2 PRAGMA foreign_keys = OFF is never re-enabled for this connection

openNativeDatabase (line 1893) returns early with if (ctx.nativeDb || !ctx.nativeAvailable) return, so the same connection is reused across incremental builds within the same PipelineContext lifetime. FK enforcement is disabled once here and remains OFF for all subsequent buildGraph() calls on that connection — not just for the first build. For older binaries this is the intended workaround, but for newer binaries (≥ v3.14) it silently discards any FK-level data-integrity signals they could otherwise surface. Adding an explicit ctx.nativeDb.exec('PRAGMA foreign_keys = ON') after buildGraph() returns would restore enforcement for follow-up operations without affecting the old-binary workaround.


Contributor Author

Fixed — added PRAGMA foreign_keys = ON immediately after buildGraph() returns (wrapped in the same try/catch for old addon versions). This restores FK enforcement for JS post-passes (CHA, dataflow, structure) without affecting the old-binary workaround during the Rust build phase.
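
The disable-then-restore pattern discussed in this thread can be sketched as follows. This is an assumption-laden illustration, not the code in native-orchestrator.ts: `NativeDb`, `setForeignKeys`, and `withForeignKeysOff` are invented names, and the sketch uses a `finally` block, which is one way to guarantee the restore even when the build throws.

```typescript
// Stand-in for the native connection; only `exec` is assumed, and it may be
// absent on very old addon versions.
type NativeDb = { exec?: (sql: string) => void };

// Best-effort toggle: a missing or failing exec is swallowed, as in the PR's try/catch.
function setForeignKeys(db: NativeDb, on: boolean): void {
  try {
    db.exec?.(`PRAGMA foreign_keys = ${on ? "ON" : "OFF"}`);
  } catch {
    // Old addon versions may not support exec; ignore.
  }
}

// Run `build` with FK enforcement off, then turn it back on for later passes.
function withForeignKeysOff<T>(db: NativeDb, build: () => T): T {
  setForeignKeys(db, false);
  try {
    return build();
  } finally {
    setForeignKeys(db, true); // restored even if build() throws
  }
}

const sqlLog: string[] = [];
const fakeDb: NativeDb = { exec: (sql) => { sqlLog.push(sql); } };
const buildResult = withForeignKeysOff(fakeDb, () => {
  sqlLog.push("buildGraph");
  return 42;
});
// sqlLog: ["PRAGMA foreign_keys = OFF", "buildGraph", "PRAGMA foreign_keys = ON"]
```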

Comment thread src/shared/constants.ts Outdated
Comment on lines +41 to +44
export function buildIgnoreSet(additionalDirs?: string[]): Set<string> {
  if (!additionalDirs || additionalDirs.length === 0) return IGNORE_DIRS;
  return new Set([...IGNORE_DIRS, ...additionalDirs]);
}

Contributor

P2 buildIgnoreSet returns a direct reference to the mutable global IGNORE_DIRS when no additional dirs are provided. Any caller that receives this return value and calls .add() or .delete() on it would mutate the shared global — a footgun that the current call sites avoid, but that future code could hit unexpectedly. Returning IGNORE_DIRS as ReadonlySet<string> communicates the aliasing intent in the type system.

Suggested change
- export function buildIgnoreSet(additionalDirs?: string[]): Set<string> {
-   if (!additionalDirs || additionalDirs.length === 0) return IGNORE_DIRS;
-   return new Set([...IGNORE_DIRS, ...additionalDirs]);
- }
+ export function buildIgnoreSet(additionalDirs?: string[]): ReadonlySet<string> {
+   if (!additionalDirs || additionalDirs.length === 0) return IGNORE_DIRS;
+   return new Set([...IGNORE_DIRS, ...additionalDirs]);
+ }


Contributor Author

Fixed — changed the return type of buildIgnoreSet to ReadonlySet<string> in constants.ts, and updated shouldSkipEntry and CollectContext.ignoreSet in helpers.ts to accept ReadonlySet<string> consistently. This prevents accidental mutation of the shared IGNORE_DIRS global through the no-extras fast path.
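
To make the aliasing point concrete, here is a minimal sketch with a shortened stand-in for `IGNORE_DIRS` (the real list differs). It shows that the no-extras fast path hands back the global itself, while the merge path returns a copy. Note that `ReadonlySet<string>` is a compile-time guard only; it does not freeze the set at runtime.

```typescript
// Shortened stand-in for the mutable global; the real contents differ.
const IGNORE_DIRS = new Set<string>(["node_modules", ".git"]);

function buildIgnoreSet(additionalDirs?: string[]): ReadonlySet<string> {
  if (!additionalDirs || additionalDirs.length === 0) return IGNORE_DIRS; // fast path: alias
  return new Set([...IGNORE_DIRS, ...additionalDirs]); // merge path: fresh copy
}

const shared = buildIgnoreSet();
const merged = buildIgnoreSet(["crates"]);

const isAlias = shared === IGNORE_DIRS; // true: same object as the global
const isCopy = merged !== IGNORE_DIRS; // true: independent Set
const globalUntouched = !IGNORE_DIRS.has("crates"); // true: merging did not mutate it

// shared.add("oops");
// ^ rejected by the compiler: 'add' does not exist on type 'ReadonlySet<string>'
```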

@github-actions

github-actions Bot commented Jun 21, 2026

Contributor

Codegraph Impact Analysis

13 functions changed, 12 callers affected across 8 files

  • shouldSkipEntry in src/domain/graph/builder/helpers.ts:59 (2 transitive callers)
  • CollectContext.ignoreSet in src/domain/graph/builder/helpers.ts:143 (0 transitive callers)
  • walkCollect in src/domain/graph/builder/helpers.ts:183 (1 transitive callers)
  • collectFiles in src/domain/graph/builder/helpers.ts:230 (0 transitive callers)
  • tryNativeOrchestrator in src/domain/graph/builder/stages/native-orchestrator.ts:2089 (5 transitive callers)
  • shouldIgnorePath in src/domain/graph/watcher.ts:13 (3 transitive callers)
  • collectTrackedFiles in src/domain/graph/watcher.ts:143 (3 transitive callers)
  • WatcherContext.ignoreSet in src/domain/graph/watcher.ts:173 (0 transitive callers)
  • setupWatcher in src/domain/graph/watcher.ts:177 (2 transitive callers)
  • startPollingWatcher in src/domain/graph/watcher.ts:232 (2 transitive callers)
  • startNativeWatcher in src/domain/graph/watcher.ts:280 (2 transitive callers)
  • buildIgnoreSet in src/shared/constants.ts:41 (4 transitive callers)
  • CodegraphConfig.ignoreAdditionalDirs in src/types.ts:1355 (0 transitive callers)

Prevents accidental mutation of the shared IGNORE_DIRS global when
the caller receives a direct reference (the no-extras fast path).
Update shouldSkipEntry and CollectContext.ignoreSet to accept
ReadonlySet<string> consistently.

Impact: 3 functions changed, 2 affected
FK enforcement is disabled before buildGraph() as a workaround for
old-binary purge failures (< v3.14). Re-enable it immediately after
buildGraph() returns so JS post-passes (CHA, dataflow, structure) run
with full FK enforcement rather than inheriting the workaround.

Impact: 1 functions changed, 5 affected
@carlos-alm

Contributor Author

@greptileai

The watcher previously used the global IGNORE_DIRS directly via shouldIgnore(),
so removing 'crates' from IGNORE_DIRS (the main change in this PR) caused the
watcher to traverse and trigger rebuilds for files under crates/ — directly
contradicting the ignoreAdditionalDirs exclusion that works correctly in batch
builds via collectFiles.

Fix: load .codegraphrc.json in setupWatcher, build the merged ignore set with
buildIgnoreSet(ignoreDirs + ignoreAdditionalDirs), store it in WatcherContext,
and thread it through collectTrackedFiles (polling mode) and shouldIgnorePath
(native OS watcher mode). Both watcher paths now respect the same exclusion
set as the batch build path.

Impact: 6 functions changed, 4 affected
@carlos-alm

Contributor Author

Fixed the watcher regression identified in the Greptile summary: watcher.ts now loads .codegraphrc.json in setupWatcher, builds the merged ignore set via buildIgnoreSet(ignoreDirs + ignoreAdditionalDirs), stores it in WatcherContext, and threads it through both collectTrackedFiles (polling mode) and shouldIgnorePath (native OS watcher mode). The watcher now respects ignoreAdditionalDirs the same way the batch build path does — so crates/ is correctly excluded in watch mode when "ignoreAdditionalDirs": ["crates"] is set in .codegraphrc.json.
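
For the native OS watcher mode, the path check might look roughly like the sketch below. The matching rule is an assumption (any path segment that appears in the ignore set), and the real `shouldIgnorePath` in watcher.ts may differ; the point is only that the function takes the merged set as a parameter instead of reading the global.

```typescript
// Assumed behavior: a path is ignored when any of its segments is in the ignore set.
// Splits on both "/" and "\" so Windows-style event paths are handled too.
function shouldIgnorePath(relPath: string, ignored: ReadonlySet<string>): boolean {
  return relPath.split(/[\\/]/).some((segment) => ignored.has(segment));
}

// Merged set as it would come out of buildIgnoreSet for this repo's config.
const watcherIgnores: ReadonlySet<string> = new Set(["node_modules", ".git", "crates"]);

const rustEvent = shouldIgnorePath("crates/codegraph-core/src/lib.rs", watcherIgnores); // true
const tsEvent = shouldIgnorePath("src/domain/graph/watcher.ts", watcherIgnores); // false
const lookalike = shouldIgnorePath("src/crates.ts", watcherIgnores); // false: segment is "crates.ts"
```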

@carlos-alm

Contributor Author

@greptileai

@carlos-alm carlos-alm merged commit b6c1934 into main Jun 21, 2026
29 checks passed
@carlos-alm carlos-alm deleted the feat/issue-1649 branch June 21, 2026 08:48
@github-actions github-actions Bot locked and limited conversation to collaborators Jun 21, 2026

Development

Successfully merging this pull request may close these issues.

feat: make IGNORE_DIRS configurable per-repo via .codegraphrc.json

1 participant