81 changes: 79 additions & 2 deletions src/domain/search/models.ts
@@ -1,11 +1,33 @@
import { execFileSync } from 'node:child_process';
import { existsSync, realpathSync } from 'node:fs';
import { createRequire } from 'node:module';
import path from 'node:path';
import { createInterface } from 'node:readline';
import { info } from '../../infrastructure/logger.js';
import { ConfigError, EngineError } from '../../shared/errors.js';

const _require = createRequire(import.meta.url);
const NPM_BIN = process.platform === 'win32' ? 'npm.cmd' : 'npm';

/** Resolve a path to its real (symlink-free) form, or return it unchanged if that fails. */
function tryRealpath(p: string): string {
try {
return realpathSync(p);
} catch {
return p;
}
}

/**
* Normalize a path for equality comparison. Windows filesystems are
* case-insensitive (but case-preserving), and paths sourced from
* `execFileSync` output vs. `require.resolve` can differ in drive-letter or
* segment casing (e.g. `C:\Users\...` vs `c:\users\...`) despite pointing at
* the same directory — so comparisons must fold case on win32.
*/
function normalizeForComparison(p: string): string {
return process.platform === 'win32' ? p.toLowerCase() : p;
}

/**
* Resolve the directory where `npm install` should run so the installed
@@ -35,6 +57,51 @@ export function resolveNpmInstallCwd(): string | undefined {
}
}

/**
* True when `dir` is npm's own global modules root — the directory whose
* `node_modules` is npm's global install target (and therefore already
* contains npm's own installation, `node_modules/npm`).
*
* Running `npm install` with this as `cwd` makes npm reify its *own*
* dependency tree as a side effect of installing the requested package.
* On at least one observed setup (Homebrew-managed Node/npm on macOS) this
* deleted npm's own installation and other co-located global packages
* before the install's lifecycle-script error even surfaced (#1720).
* Global installs of codegraph must never run `npm install` here.
*
* Authoritative check: ask npm itself for its global modules root
* (`npm root -g`) and compare against `dir`. This avoids misclassifying a
* normal project that merely happens to depend on the `npm` package itself
* (e.g. a tool that shells out to npm) — such a project's `node_modules/npm`
* would satisfy the old file-existence heuristic without actually being
* npm's global root. Falls back to the file-existence heuristic only if the
* `npm root -g` call itself fails (e.g. npm binary unavailable in PATH).
*
* @internal Exported for unit tests; not part of the public barrel.
*/
export function isNpmGlobalModulesRoot(dir: string | undefined): boolean {
if (!dir) return false;

const candidate = normalizeForComparison(tryRealpath(path.join(dir, 'node_modules')));
try {
const globalRoot = execFileSync(NPM_BIN, ['root', '-g'], {
encoding: 'utf8',
timeout: 10_000,
}).trim();
if (globalRoot) {
return normalizeForComparison(tryRealpath(globalRoot)) === candidate;
}
} catch {
// npm unavailable/unresolvable — fall back to the heuristic below.
}

try {
return existsSync(path.join(dir, 'node_modules', 'npm', 'package.json'));

Contributor

P1 Local Npm Dependency Skips Install

When @optave/codegraph is installed locally in a project that also has npm in node_modules, resolveNpmInstallCwd() resolves to that project root and this check classifies it as npm's global root. promptInstall() then skips the safe local npm install --no-save @huggingface/transformers path and tells the user to install globally, leaving semantic search unavailable for a normal project install.


Contributor Author

Fixed — isNpmGlobalModulesRoot() now asks npm directly via npm root -g and compares the real (symlink-resolved) path against the candidate directory, instead of just checking for the presence of node_modules/npm/package.json. A local project that happens to depend on the npm package itself no longer gets misclassified as npm's global root, since its node_modules/npm won't match npm's actual configured global root. The old file-existence check is kept only as a fallback for the rare case where the npm root -g call itself fails (e.g. npm binary unavailable in PATH). Added test coverage for: (1) authoritative match via npm root -g, (2) a project with npm as a dependency that is correctly NOT flagged, (3) fallback behavior when npm root -g fails. See commit ffa1adf.
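
To illustrate why the comparison goes through the real (symlink-resolved) path rather than the raw string, here is a minimal sketch. `sameRealDir` is a hypothetical helper written for this example, not a function from the PR, and the demo assumes the process is allowed to create a directory symlink in the OS temp directory.

```typescript
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';

/** Hypothetical helper: true when both paths resolve to the same real directory. */
export function sameRealDir(a: string, b: string): boolean {
  // Fold case on win32 only, mirroring the normalization described in the diff.
  const fold = (p: string): string =>
    process.platform === 'win32' ? p.toLowerCase() : p;
  return fold(fs.realpathSync(a)) === fold(fs.realpathSync(b));
}

// Demo: a directory and a symlink that points at it.
const base = fs.mkdtempSync(path.join(os.tmpdir(), 'realpath-demo-'));
const target = path.join(base, 'real');
const link = path.join(base, 'link');
fs.mkdirSync(target);
fs.symlinkSync(target, link, 'dir');

// The raw strings differ, but both resolve to the same real directory.
const plainEqual = link === target;
const realEqual = sameRealDir(link, target);
console.log({ plainEqual, realEqual });

fs.rmSync(base, { recursive: true, force: true });
```

Unlike `tryRealpath` in the diff, this sketch lets `realpathSync` throw for a missing path; the PR's version returns the input unchanged in that case.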

Contributor

P1 Fallback Misclassifies Projects

When npm root -g fails, this fallback goes back to treating any node_modules/npm/package.json as npm's global install root. A local project that depends on npm can still hit this path if the probe times out or the PATH/config used for the probe is broken, while the normal local npm install --no-save @huggingface/transformers path would have worked. In that case promptInstall() skips the local install and tells the user to install globally, leaving semantic search unavailable for the local project.


Contributor Author

Leaving this as-is — it's an intentional safety trade-off, not an oversight.

The fallback only triggers when npm root -g itself fails (ENOENT, timeout, broken PATH/config). In that scenario there are two ways to get it wrong:

  • False positive (what you're flagging): a local project that happens to depend on the npm package gets misclassified as npm's global root, so auto-install is skipped and the user has to run npm install @huggingface/transformers manually once. Annoying, but harmless and fully recoverable.
  • False negative (removing the fallback entirely, or defaulting to "not global" on probe failure): if npm can't be queried, we'd fall through to running npm install unconditionally, which is exactly the destructive scenario that #1720 ("bug: embed auto-install cwd resolves into global npm modules root for global installs, breaking install") exists to prevent, and we'd have zero signal to catch it.

Given the original bug deleted npm's own global installation and co-located packages, I'd rather over-skip (extra manual step, easily recoverable) than under-skip (risk reification of npm's global tree again). It's also a narrow double-failure: npm root -g has to fail and the local project has to independently ship node_modules/npm/package.json. If npm root -g fails because npm/PATH is broken, the subsequent npm install call would very likely fail too anyway, so skipping straight to manual guidance is arguably the better UX even in the true-local case.

Happy to revisit if you see a way to disambiguate the two cases without another npm round-trip, but I don't think there's a safe way to remove the fallback outright.
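
The decision order argued for above can be restated as a small pure function. This is a simplified sketch, not the PR's code: `shouldSkipAutoInstall`, `Probe`, and the example paths are hypothetical names, the probe stands in for `npm root -g`, and `hasBundledNpm` stands in for the `node_modules/npm/package.json` existence check.

```typescript
/** Stand-in for `npm root -g`: returns the global modules root, or throws. */
type Probe = () => string;

/**
 * Hypothetical restatement of the check: trust the probe when it answers,
 * otherwise fall back to the conservative file-existence heuristic.
 */
export function shouldSkipAutoInstall(
  candidateNodeModules: string,
  probeGlobalRoot: Probe,
  hasBundledNpm: boolean,
): boolean {
  try {
    const globalRoot = probeGlobalRoot().trim();
    if (globalRoot) return globalRoot === candidateNodeModules;
  } catch {
    // Probe failed (ENOENT, timeout, broken PATH): use the heuristic below.
  }
  return hasBundledNpm;
}

const probeOk: Probe = () => '/global/lib/node_modules\n';
const probeFails: Probe = () => {
  throw new Error('npm: command not found');
};

// Probe works: a local project that depends on `npm` is not skipped.
console.log(shouldSkipAutoInstall('/proj/node_modules', probeOk, true));
// Probe fails: the same project is skipped (the accepted false positive).
console.log(shouldSkipAutoInstall('/proj/node_modules', probeFails, true));
```

The only input combination that over-skips is a failed probe together with a local `node_modules/npm`, which is the narrow double-failure described above.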

} catch {
return false;
}
}

export interface ModelConfig {
name: string;
dim: number;
@@ -138,7 +205,6 @@ export const MODELS: Record<string, ModelConfig> = {
export const EMBEDDING_STRATEGIES: readonly string[] = ['structured', 'source'];

export const DEFAULT_MODEL: string = 'nomic';
- const NPM_BIN = process.platform === 'win32' ? 'npm.cmd' : 'npm';
const BATCH_SIZE_MAP: Record<string, number> = {
minilm: 32,
'jina-small': 16,
@@ -173,6 +239,14 @@ export function getModelConfig(modelKey?: string): ModelConfig {
*/
export function promptInstall(packageName: string): Promise<boolean> {
const installCwd = resolveNpmInstallCwd();

if (isNpmGlobalModulesRoot(installCwd)) {
info(
`${packageName} is missing, but codegraph is installed globally — auto-install is skipped to avoid modifying npm's own global installation.\nInstall it yourself with:\n npm install -g ${packageName}`,
);
return Promise.resolve(false);
Comment on lines +245 to +247

Contributor

P2 Global Guidance Stays Unresolvable

This recovery path tells users to run npm install -g @huggingface/transformers, but loadTransformers() loads the package with a plain dynamic import(pkg). A globally installed dependency is not resolved from the globally installed codegraph module unless it is installed under that module's own resolution path, so a user can follow this instruction and still hit the same ENGINE_UNAVAILABLE error.


Contributor Author

Investigated this. I don't believe it's actually broken, and here's why: Node's module resolution algorithm (both CJS require.resolve and ESM import()) walks up through every ancestor node_modules directory from the resolving file's real (symlink-resolved) path, not just the importing package's own node_modules.

When codegraph is installed globally, it lives at <npmGlobalRoot>/node_modules/@optave/codegraph/.... Running npm install -g @huggingface/transformers places that package at <npmGlobalRoot>/node_modules/@huggingface/transformers, a sibling directory.

I verified with a real filesystem test (mirroring codegraph's actual nested file structure, src/domain/search/models.ts, and using both require.resolve and dynamic import()) that this sibling resolves correctly in both CJS and ESM: the walk-up algorithm doesn't stop at the originating package's own directory. This matches how resolveNpmInstallCwd() already works for the auto-install case (which relies on the same walk-up behavior). I also confirmed npm's bin symlink (/opt/homebrew/bin/npm) resolves via realpath back into <npmGlobalRoot>/node_modules/npm/bin/..., so the same realpath-based resolution applies uniformly.

Given this, the npm install -g <pkg> guidance should correctly make the package resolvable. If you have a concrete repro where this breaks (e.g. a package manager or Node version manager with a different global layout), please share it and I'll dig further. Happy to be proven wrong with a reproduction.
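
The walk-up can be sketched as a pure function that lists the candidate node_modules directories for a starting directory. This is an illustration only: `nodeModulesLookupPaths` is a hypothetical helper, `/opt/globalroot` is a made-up stand-in for <npmGlobalRoot>, and the sketch handles POSIX paths only and ignores symlinks (it assumes the input is already a real path).

```typescript
import * as path from 'node:path';

/**
 * Hypothetical helper (not part of codegraph): list the node_modules
 * directories searched for a bare specifier, starting at `fromDir` and
 * walking up to the filesystem root.
 */
export function nodeModulesLookupPaths(fromDir: string): string[] {
  const paths: string[] = [];
  let dir = path.posix.resolve(fromDir);
  for (;;) {
    // A directory named `node_modules` does not get `node_modules` appended.
    if (path.posix.basename(dir) !== 'node_modules') {
      paths.push(path.posix.join(dir, 'node_modules'));
    }
    const parent = path.posix.dirname(dir);
    if (parent === dir) break; // reached "/"
    dir = parent;
  }
  return paths;
}

const lookups = nodeModulesLookupPaths(
  '/opt/globalroot/node_modules/@optave/codegraph/src/domain/search',
);
// The shared global node_modules is among the candidates, so a sibling
// package such as @huggingface/transformers installed there can be found.
console.log(lookups.includes('/opt/globalroot/node_modules'));
console.log(lookups);
```

The list is ordered nearest-first, so a copy of the package nested inside codegraph's own node_modules would take precedence over the global sibling.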

Contributor

You're right — I was wrong. The walk-up algorithm does traverse to <npmGlobalRoot>/node_modules/ from within <npmGlobalRoot>/node_modules/@optave/codegraph/..., so a sibling global package is on the resolution path. The npm install -g <pkg> guidance is valid.

Thanks for the thorough investigation. Shall I remember this as a rule: "Do not flag global sibling node_modules resolution as broken — Node's walk-up algorithm resolves sibling packages under a shared global node_modules root correctly in both CJS and ESM"?

}

if (!process.stdin.isTTY) {
info(`Installing ${packageName} (optional dependency for semantic search)…`);
try {
@@ -237,7 +311,10 @@ export async function loadTransformers(): Promise<unknown> {
);
}
}
- throw new EngineError(`Semantic search requires ${pkg}.\nInstall it with: npm install ${pkg}`);
const manualInstall = isNpmGlobalModulesRoot(resolveNpmInstallCwd())
? `npm install -g ${pkg}`
: `npm install ${pkg}`;
throw new EngineError(`Semantic search requires ${pkg}.\nInstall it with: ${manualInstall}`);
}
}

129 changes: 129 additions & 0 deletions tests/unit/prompt-install.test.ts
@@ -8,6 +8,8 @@
* so every test gets a fresh embedder module with its own mocks.
*/

import fs from 'node:fs';
import os from 'node:os';
import path from 'node:path';
import { afterEach, beforeEach, describe, expect, test, vi } from 'vitest';

@@ -232,3 +234,130 @@ describe('resolveNpmInstallCwd', () => {
expect(resolveNpmInstallCwd()).toBeUndefined();
});
});

describe('isNpmGlobalModulesRoot', () => {
let tmpDir: string;

beforeEach(() => {
vi.resetModules();
tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), 'codegraph-npm-global-'));
});

afterEach(() => {
fs.rmSync(tmpDir, { recursive: true, force: true });
vi.restoreAllMocks();
});

test('returns true when `npm root -g` matches dir/node_modules', async () => {
const execMock = vi.fn(() => `${path.join(tmpDir, 'node_modules')}\n`);
vi.doMock('node:child_process', () => ({ execFileSync: execMock }));

const { isNpmGlobalModulesRoot } = await import('../../src/domain/search/models.js');
expect(isNpmGlobalModulesRoot(tmpDir)).toBe(true);
expect(execMock).toHaveBeenCalledWith(
expectedNpmBin,
['root', '-g'],
expect.objectContaining({ encoding: 'utf8' }),
);
});

test('returns false for a normal project dir even if it depends on the npm package', async () => {
// A project that happens to have `npm` as a dependency must NOT be
// misclassified as npm's global root — only `npm root -g` is authoritative.
fs.mkdirSync(path.join(tmpDir, 'node_modules', 'npm'), { recursive: true });
fs.writeFileSync(path.join(tmpDir, 'node_modules', 'npm', 'package.json'), '{}');
const execMock = vi.fn(() => `${path.join(os.tmpdir(), 'some-other-global-root')}\n`);
vi.doMock('node:child_process', () => ({ execFileSync: execMock }));

const { isNpmGlobalModulesRoot } = await import('../../src/domain/search/models.js');
expect(isNpmGlobalModulesRoot(tmpDir)).toBe(false);
});

test('falls back to node_modules/npm heuristic when `npm root -g` fails', async () => {
fs.mkdirSync(path.join(tmpDir, 'node_modules', 'npm'), { recursive: true });
fs.writeFileSync(path.join(tmpDir, 'node_modules', 'npm', 'package.json'), '{}');
const execMock = vi.fn(() => {
throw new Error('npm: command not found');
});
vi.doMock('node:child_process', () => ({ execFileSync: execMock }));

const { isNpmGlobalModulesRoot } = await import('../../src/domain/search/models.js');
expect(isNpmGlobalModulesRoot(tmpDir)).toBe(true);
});

test('falls back to false when `npm root -g` fails and heuristic dir is a normal project', async () => {
fs.mkdirSync(path.join(tmpDir, 'node_modules', '@optave', 'codegraph'), { recursive: true });
const execMock = vi.fn(() => {
throw new Error('npm: command not found');
});
vi.doMock('node:child_process', () => ({ execFileSync: execMock }));

const { isNpmGlobalModulesRoot } = await import('../../src/domain/search/models.js');
expect(isNpmGlobalModulesRoot(tmpDir)).toBe(false);
});

test('returns false when dir is undefined', async () => {
const { isNpmGlobalModulesRoot } = await import('../../src/domain/search/models.js');
expect(isNpmGlobalModulesRoot(undefined)).toBe(false);
});
});

describe('promptInstall: global codegraph install', () => {
let tmpDir: string;
let origTTY: any;

beforeEach(() => {
vi.resetModules();
origTTY = process.stdin.isTTY;
tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), 'codegraph-global-install-'));
// Simulate npm's own global modules root: <tmpDir>/node_modules/npm + the
// globally-installed codegraph package living alongside it.
fs.mkdirSync(path.join(tmpDir, 'node_modules', 'npm'), { recursive: true });
fs.writeFileSync(path.join(tmpDir, 'node_modules', 'npm', 'package.json'), '{}');
fs.mkdirSync(path.join(tmpDir, 'node_modules', '@optave', 'codegraph'), { recursive: true });

const fakePkg = path.join(tmpDir, 'node_modules', '@optave', 'codegraph', 'package.json');
vi.doMock('node:module', () => ({
createRequire: () => ({
resolve: (req: string) => {
if (req === '@optave/codegraph/package.json') return fakePkg;
throw new Error(`Cannot find: ${req}`);
},
}),
}));
});

afterEach(() => {
process.stdin.isTTY = origTTY;
fs.rmSync(tmpDir, { recursive: true, force: true });
vi.restoreAllMocks();
});

test('never invokes npm install and rejects with -g guidance', async () => {
process.stdin.isTTY = undefined;

// `npm root -g` resolves to the same simulated global root set up in
// beforeEach, so isNpmGlobalModulesRoot() classifies it authoritatively
// rather than via the node_modules/npm fallback heuristic.
const execMock = vi.fn((_bin: string, args: string[]) => {
if (args[0] === 'root') return `${path.join(tmpDir, 'node_modules')}\n`;
throw new Error('npm install should never be invoked in this scenario');
});
vi.doMock('node:child_process', () => ({ execFileSync: execMock }));
vi.doMock('@huggingface/transformers', () => {
throw new Error('Cannot find package');
});

const { embed } = await import('../../src/domain/search/index.js');

await expect(embed(['test'], 'minilm')).rejects.toThrow(
'npm install -g @huggingface/transformers',
);
// npm install must never have been attempted — only the read-only `npm root -g` probe.
expect(execMock).not.toHaveBeenCalledWith(
expect.anything(),
expect.arrayContaining(['install']),
expect.anything(),
);
});
});