Skip to content

feat(sea): let users override the base Node binary#279

Open
cstrahan wants to merge 3 commits into
yao-pkg:mainfrom
cstrahan:cstrahan/sea-node-override
Open

feat(sea): let users override the base Node binary#279
cstrahan wants to merge 3 commits into
yao-pkg:mainfrom
cstrahan:cstrahan/sea-node-override

Conversation

@cstrahan

Copy link
Copy Markdown

Context: I work on a team that distributes software to on-prem customers, many of whom are on old systems like EL 7 with a glibc as old as 2.17. To date, this has kept us targeting very old versions of node, resulting in a web of old dependencies. I have recently built nodejs v24 on EL 7 (so the resulting binaries will run pretty much anywhere), and I would like to use it together with @yao-pkg/pkg in enhanced SEA mode so we can finally upgrade all of our on-prem software.

Unfortunately, SEA mode previously always downloaded a base Node binary from nodejs.org, so the resulting executable was pinned to an official build. That makes it impossible to ship a SEA on top of a custom runtime -- e.g. a Node built against an older glibc (to run on EL7 / older distros), or one configured differently.

The internal SeaOptions / SeaEnhancedOptions already accepted nodePath / useLocalNode; this just exposes them:

--sea-node-path embed a specific Node binary as the base
--sea-use-local-node embed the Node running pkg (process.execPath)

Both are also settable in the pkg config (seaNodePath / seaUseLocalNode) and via the programmatic exec() API. The two are mutually exclusive. The embedded binary's major version must match the target's (the existing version-skew checks still apply).

This is also a prerequisite for adopting node --build-sea (Node >= 25.5): that flag is only meaningful once the base Node is no longer a fixed download. I'll have a separate PR for that coming up as well.

SEA mode previously always downloaded a base Node binary from nodejs.org, so
the resulting executable was pinned to an official build. That makes it
impossible to ship a SEA on top of a custom runtime -- e.g. a Node built
against an older glibc (to run on EL7 / older distros), or one configured
differently.

The internal SeaOptions / SeaEnhancedOptions already accepted `nodePath` /
`useLocalNode`; this just exposes them:

  --sea-node-path <path>   embed a specific Node binary as the base
  --sea-use-local-node     embed the Node running pkg (process.execPath)

Both are also settable in the pkg config (seaNodePath / seaUseLocalNode) and
via the programmatic exec() API. The two are mutually exclusive. The embedded
binary's major version must match the target's (the existing version-skew
checks still apply).

This is also a prerequisite for adopting `node --build-sea` (Node >= 25.5):
that flag is only meaningful once the base Node is no longer a fixed download.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@robertsLando robertsLando left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recommend reworking around PKG_NODE_PATH rather than new SEA-only flags

pkg already has a documented way to supply a custom base Node binary — the PKG_NODE_PATH env var (pkg-fetch localPlace()), used by standard mode. It does not flow into SEA today: SEA downloads from nodejs.org via its own path in lib/sea.ts, and docs-site/guide/custom-node.md explicitly documents that gap ("SEA mode … does not honour PKG_NODE_PATH … For SEA, use process.execPath … directly").

This PR closes the gap but introduces a second, SEA-only mechanism instead of extending the one that already exists. I'd lean toward the smaller surface / one mental model:

Layer Recommendation
Honor PKG_NODE_PATH in SEA Add — fold the env var into the opts.nodePath fallback in getNodejsExecutable (~2 lines). Matches standard mode and retires the custom-node.md contradiction.
seaNodePath config key + programmatic option Keep — the only thing env can't do: live in package.json pkg config / be passed to exec({…}).
--sea-node-path CLI flag Optional — redundant with the env var; keep only for --help discoverability.
--sea-use-local-node flag Drop — it's exactly PKG_NODE_PATH="$(command -v node)".
Single-target / platform-match guard Required regardless of interface (see inline on lib/index.ts).

Precedence stays pkg's existing convention: CLI > config (seaNodePath) > PKG_NODE_PATH env.

Net: env var + config key + guard is a smaller change than this PR while covering more — it makes PKG_NODE_PATH work in SEA, which the flags alone don't.

The inline threads below are correctness/doc/test findings that apply whichever interface ships.

Nit (not in this diff): pre-existing typo PriovidedProvided at lib/sea.ts:364, in the explicit-path branch your nodePath override now exercises — worth fixing while here.

Comment thread lib/index.ts Outdated
// Base-node override (both forms are mutually exclusive). nodePath embeds a
// specific binary; useLocalNode embeds the Node running pkg. Either lets you
// ship a SEA built on a custom runtime (e.g. one linked against older glibc).
if (flags.seaNodePath && flags.seaUseLocalNode) {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Correctness — multi-target silently bakes the wrong binary. The mutual-exclusion guard is good, but the bigger footgun is unguarded: getNodejsExecutable (lib/sea.ts) returns this single binary for every target regardless of platform/arch. With --targets node22-linux-x64,node22-macos-arm64 --sea-node-path /linux/node, both outputs get the linux binary — the macOS output is silently corrupt. --sea-use-local-node is worse: the host's node gets baked into every cross-platform target.

Add a guard here: when seaNodePath / seaUseLocalNode (or PKG_NODE_PATH) is set, require a single target whose platform+arch match the supplied binary. Standard-mode PKG_NODE_PATH carries the same "single target only" caveat (documented in custom-node.md), but there it falls back to fetching the other platforms — here it produces a broken artifact instead.

Comment thread lib/types.ts Outdated
/**
* SEA mode: path to a base Node binary to embed instead of downloading one
* from nodejs.org. Use to embed a custom build (e.g. one linked against an
* older glibc). Its major version must match the target's.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 This claim isn't enforced. "Its major version must match the target's" — nothing checks it. assertSingleTargetMajor only compares the targets' nodeRange to each other, never to the supplied binary's actual version. So --targets node24 --sea-node-path /node22/bin/node silently produces a working node22 SEA and drops the requested node24.

Either validate major(nodePath --version) === major(targetRange) (cheap — resolveTargetNodeVersion already runs node --version), or reword the doc to say the supplied binary overrides the requested nodeRange.

Comment thread lib/help.ts Outdated
--no-signature skip macOS binary signing [default: sign]
--sea (Experimental) compile given file using node's SEA feature. Requires node v20.0.0 or higher and only single file is supported
--sea-node-path SEA mode: path to a base Node binary to embed instead of downloading one (must match the target's major version)
--sea-use-local-node SEA mode: embed the Node binary running pkg (process.execPath) as the base

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 Docs now stale/contradictory. docs-site/guide/custom-node.md currently tells users SEA "does not honour PKG_NODE_PATH … For SEA, use process.execPath … directly." After this PR that page is wrong and needs updating in the same PR. (If you adopt the PKG_NODE_PATH-in-SEA approach from the review summary, this becomes a one-line "now works in SEA too" rather than documenting a second mechanism.)

Comment thread test/test-95-sea-local-node/main.js Outdated
// `--sea-use-local-node` embeds the Node binary running pkg as the SEA base
// instead of downloading one. Exercises the base-node override end to end (and,
// on Node >= 25.5 hosts, the in-core `--build-sea` injection path).
utils.runSeaHostOnly(input, testName, ['--sea-use-local-node']);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 --sea-node-path has no e2e coverage. Only --sea-use-local-node gets a runtime test here; the explicit --sea-node-path branch (the exists() check + node --version probe in lib/sea.ts) is never exercised end to end. Cheap add: a second host-only case passing --sea-node-path = the host's process.execPath to hit that branch.

@cstrahan

Copy link
Copy Markdown
Author

pkg already has a documented way to supply a custom base Node binary — the PKG_NODE_PATH env var (pkg-fetch localPlace()), used by standard mode. It does not flow into SEA today: SEA downloads from nodejs.org via its own path in lib/sea.ts,

Ah, I initially attempted to use PKG_NODE_PATH, from reading the docs at https://www.npmjs.com/package/@yao-pkg/pkg-fetch, thinking that PKG_NODE_PATH might be honored by the SEA code path, and then found (empirically) that this wasn't the case.

and docs-site/guide/custom-node.md explicitly documents that gap ("SEA mode … does not honour PKG_NODE_PATH … For SEA, use process.execPath … directly").

D'oh! I completely missed https://yao-pkg.github.io/pkg/guide/custom-node -- sorry.

I'll see if I can implement the suggestions from review shortly 👍

…back)

Addresses the review on yao-pkg#279 — extend the existing mechanism rather than add a
SEA-only one:

- Honour PKG_NODE_PATH in SEA: fold it into getNodejsExecutable's nodePath
  fallback (and the version resolver), matching standard mode. Precedence is
  CLI (--sea-node-path) > config (seaNodePath) > PKG_NODE_PATH.
- Drop the --sea-use-local-node flag + seaUseLocalNode config key — it was just
  PKG_NODE_PATH="$(command -v node)". Keep --sea-node-path (for --help
  discoverability) and the seaNodePath config / exec() option.
- Required guard: a custom base binary is restricted to a single target whose
  major matches the binary; otherwise pkg errors instead of silently baking the
  wrong binary into other targets' outputs. SEA has no per-platform fetch
  fallback like standard mode, and assertSingleTargetMajor only compares the
  targets to each other, never to the supplied binary.
- Update docs-site/guide/custom-node.md (it previously said SEA does not honour
  PKG_NODE_PATH).
- Tests: drop the seaUseLocalNode unit cases; rework the e2e (now
  test-95-sea-custom-node) to cover both --sea-node-path and PKG_NODE_PATH.
- Fix pre-existing typo "Priovided" -> "Provided" in lib/sea.ts:364.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@cstrahan

Copy link
Copy Markdown
Author

@robertsLando I think I've addressed everything from review.

When this is ready, I can squash the history down to a single commit, if you'd like (left the commits split for now, in case that helps with review).

… target

Follow-up to the yao-pkg#279 review. The earlier single-target guard stopped
multi-target runs, but a SINGLE target with a mismatched binary still slipped
through — e.g. `--targets node24-linux-x64 --sea-node-path <macOS node>`
silently produced a corrupt linux output. Per the reviewer's ask ("require a
single target whose platform+arch match the supplied binary"):

- Sniff the supplied binary's container format (ELF / Mach-O / PE) and CPU arch
  from its magic bytes (sniffBinaryTarget) and reject a mismatch against the
  target — a Mach-O binary for a `linux` target, or an x64 binary for `arm64`.
- Reject targets that span more than one distinct platform+arch: one binary
  can't be several at once, including linux vs alpine vs linuxstatic (mutually
  exclusive glibc / musl / static — indistinguishable from the header, but the
  user clearly can't have meant them simultaneously). This replaces the blunt
  single-target-count check with a per-(platform,arch) one.
- The ELF glibc/musl/static flavor isn't in the header, so matching that to
  linux / alpine / linuxstatic remains the user's responsibility (documented).

Tests: a unit test for sniffBinaryTarget (crafted ELF/Mach-O/PE headers), and
unhappy-path e2e cases in test-95-sea-custom-node (multi platform/arch, multi
linux-flavor, wrong platform, wrong major — all host-independent, exit 2).
Updates --help + docs-site/guide/custom-node.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@cstrahan

Copy link
Copy Markdown
Author

Sorry, I just just realized I missed part of this:

With --targets node22-linux-x64,node22-macos-arm64 --sea-node-path /linux/node, both outputs get the linux binary — the macOS output is silently corrupt.

Latest commit sniffs the executable's headers to determine container format and architecture, and then I additionally assert that there isn't some disagreement between the given node binary and the requested targets.

With that, I think that's everything.

@robertsLando robertsLando left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assertCustomBaseNodeTarget / sniffBinaryTarget — simpler approach

The binary-format sniffing here is more machinery than the problem needs. sniffBinaryTarget parses ELF/Mach-O/PE magic bytes with three machine-code lookup tables (ELF_MACHINE, MACHO_CPU, PE_MACHINE), plus BinaryFormat, formatForPlatform, and FORMAT_LABEL — ~90 lines whose whole job is to guess a binary's platform/arch from its header.

But the base binary is a Node executable, and step 3 already runs it (execFileAsync(binPath, ['--version'])). If we're running it anyway, just ask it everything in one shot:

node -p "process.version + ' ' + process.platform + ' ' + process.arch"

The key point: sniffing's only theoretical edge over running the binary is validating a base that can't run on the build host (a cross-platform base). But step 3's version check already execs the binary unconditionally for a custom path — and so does resolveTargetNodeVersion. So a non-host-runnable base already fails today, before sniffing helps. Sniffing buys nothing the PR actually delivers; it's just inconsistent with the exec already happening (and runs --version twice).

Suggested shape

Keep step 1 (the multi-target guard — simple and useful). Replace the sniff + version exec with one probe:

/** What a Node binary reports about itself when run. */
async function probeNode(
  binPath: string,
): Promise<{ version: string; platform: string; arch: string }> {
  if (binPath === process.execPath) {
    return { version: process.version, platform: process.platform, arch: process.arch };
  }
  const { stdout } = await execFileAsync(binPath, [
    '-p',
    "process.version + ' ' + process.platform + ' ' + process.arch",
  ]);
  const [version, platform, arch] = stdout.trim().split(' ');
  return { version, platform, arch };
}

/** Target-suffix -> the value Node's process.platform reports. */
const PROCESS_PLATFORM: Record<string, string> = {
  macos: 'darwin',
  win: 'win32',
  // linux / alpine / linuxstatic / freebsd report their own name verbatim
};

async function assertCustomBaseNodeTarget(
  targets: (NodeTarget & Partial<Target>)[],
  opts: GetNodejsExecutableOptions,
): Promise<void> {
  const customNode = resolveCustomBaseNode(opts);
  if (!customNode && !opts.useLocalNode) return;
  const binPath = customNode ?? process.execPath;

  // 1. One binary = one platform+arch. (linux/alpine/linuxstatic stay distinct.)
  const combos = new Map<string, { platform: string; arch: string }>();
  for (const t of targets) {
    combos.set(`${t.platform}-${t.arch}`, {
      platform: String(t.platform),
      arch: String(t.arch),
    });
  }
  if (combos.size > 1) {
    throw wasReported(
      `A custom base Node binary applies to a single platform/arch, but the requested ` +
        `targets span ${combos.size}: ${[...combos.keys()].join(', ')}. Run pkg once per target.`,
    );
  }
  const { platform, arch } = [...combos.values()][0];

  // 2. Ask the binary what it actually is.
  const node = await probeNode(binPath);

  const expectedPlatform = PROCESS_PLATFORM[platform] ?? platform;
  if (node.platform !== expectedPlatform) {
    throw wasReported(
      `Custom base Node binary is for "${node.platform}", but target "${platform}" ` +
        `needs "${expectedPlatform}".`,
    );
  }
  if (node.arch !== arch) {
    throw wasReported(
      `Custom base Node binary is ${node.arch}, but target arch is "${arch}".`,
    );
  }

  const targetMajor = parseInt(targets[0].nodeRange.replace('node', ''), 10);
  const binMajor = parseInt(node.version.replace(/^v/, ''), 10);
  if (!Number.isNaN(targetMajor) && binMajor !== targetMajor) {
    throw wasReported(
      `Custom base Node binary is ${node.version} (major ${binMajor}), but target ` +
        `"${targets[0].nodeRange}" requests Node ${targetMajor}.`,
    );
  }
}

Net effect

  • Deletes sniffBinaryTarget, BinaryFormat, ELF_MACHINE / MACHO_CPU / PE_MACHINE, formatForPlatform, FORMAT_LABEL, and the open import — ~90 lines and one whole concept gone.
  • process.platform reports linux for glibc/musl/static alike, so the linux-family caveat is preserved for free; process.arch matches the NodeArch values directly, so no arch table is needed.
  • probeNode can also replace the duplicate --version exec in resolveTargetNodeVersion.

One thing worth stating explicitly in the PR/docs: this makes "the base binary must be runnable on the build host" an honest, documented constraint rather than a half-supported one. That's already true in practice (both code paths exec it), so it isn't a regression — just no longer implying cross-platform bases work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants