Skip to content

feat(cli): capture-video on-demand fetcher + capture/lint/producer fixes#1438

Closed
ukimsanov wants to merge 2 commits into
mainfrom
pr/cli-capture-w2h-fixes
Closed

feat(cli): capture-video on-demand fetcher + capture/lint/producer fixes#1438
ukimsanov wants to merge 2 commits into
mainfrom
pr/cli-capture-w2h-fixes

Conversation

@ukimsanov

@ukimsanov ukimsanov commented Jun 14, 2026

Copy link
Copy Markdown
Collaborator

Context

For the hyperframes.dev website-to-video flow. The capture pipeline needed two things that real-AI-test runs against heygen.com, huly.io, and heygen-showcase surfaced: (1) better signals for logo detection / asset captioning, and (2) a CLI surface to pull the videos the manifest references on demand.

Summary

  • hyperframes capture-video <dir> — on-demand downloader for entries in video-manifest.json. Capture writes the manifest + preview PNGs but skips the mp4s; this pulls one entry at a time by --index N (matched against the entry's index field, not array offset — gaps are possible). SSRF-safe via safeFetch, 250 MB cap, content-type whitelist, race-free exclusive-create write.
  • Capture fixes: structural logo signals (inBanner / inHomeLink / matchesTitleBrand) — class-substring alone caught 0/32 SVGs on heygen.com; content-hash SVG slugs (label-derived slugs mis-attributed partner-logo carousels); SVG→PNG rasterization before Gemini Vision (was hallucinating wordmarks); double-escape \/ inside the page.evaluate template literal (Unexpected token ^ blocked every capture).
  • Lint fixes: findRootTag masks comment/style/script ranges (a <video> token inside a CSS comment was being picked as root); new lintMissingLocalAsset rule (the most common sub-agent mistake across multi-URL runs).
  • Producer fix: ESM banner shims __dirname / __filename — ffmpeg/Emscripten wasm glue does scriptDirectory = __dirname + "/" and was crashing renders.

Removed from this PR

  • hyperframes sfx + hyperframes music — per James: HeyGen CLI already covers this; not duplicating the API surface.

Verification

  • bun run --filter @hyperframes/cli typecheck passes
  • bun run --filter @hyperframes/cli test — 794/794 (incl. 17 new tests for capture-video.ts: safeFilename decoding/sanitization, VIDEO_CONTENT_TYPE_RE accept/reject, pickManifestEntry index-field lookup with gaps, URL-mismatch + bad-index rejection, --index over --url priority)
  • node packages/producer/build.mjs esbuild bundles complete cleanly (no missing-worker reference)

Test plan

  • bun install clean, bun run build succeeds
  • hyperframes capture <url> produces <dir>/extracted/video-manifest.json; hyperframes capture-video <dir> --list finds it and --index 0 downloads cleanly
  • Snippet output from capture-video passes hyperframes lint (id attribute present)
  • Producer render of an existing project completes without __dirname errors

Comment thread packages/cli/src/commands/capture-video.ts Fixed
Comment thread packages/cli/src/commands/capture-video.ts Fixed
@ukimsanov ukimsanov force-pushed the pr/cli-capture-w2h-fixes branch 3 times, most recently from 9795334 to 5db1909 Compare June 14, 2026 13:59
Comment thread packages/cli/src/utils/lintProject.ts Fixed
@ukimsanov ukimsanov force-pushed the pr/cli-capture-w2h-fixes branch from 5db1909 to e9d3e7c Compare June 14, 2026 21:02
Comment thread packages/cli/src/commands/capture-video.ts Fixed
Comment thread packages/cli/src/utils/lintProject.ts Fixed
// race (CodeQL js/file-system-race) so a concurrent run can't have its
// download silently clobbered here.
try {
writeFileSync(outPath, buf, { flag: "wx" });
…st runs

Surgical bugfixes accumulated across a series of real-AI-test runs
(heygen.com, huly.io, heygen-showcase). Each fix targets a specific
observed defect; happy paths are untouched.

packages/cli/src/capture/

  • assetCataloger.ts: surface three structural logo signals on every
    cataloged asset (inBanner / inHomeLink / matchesTitleBrand). The
    prior class-substring-only isLogo detector caught 0/32 SVGs on
    heygen.com and 0/19 on huly.io — modern React/Tailwind builds
    don't put "logo" or "brand" in any className. The new signals
    catch the universal "site header logo" pattern. Boolean merge
    semantics: any positive sample wins through context-merge +
    srcset dedup.

  • tokenExtractor.ts: broaden inline-SVG isLogo via the same three
    structural signals (header/nav/role=banner ancestor, root-href
    anchor parent, document.title brand-segment match in aria-label).
    No change to the existing class-substring detector — runs first,
    new heuristics only fire when it misses.

  • assetDownloader.ts: content-hash SVG slugs. SVG filenames are now
    `svg-<8char-sha1>.svg` (or `logo-<hash>.svg` when isLogo flags
    fire), replacing the previous label-derived slugging that
    mis-attributed brand carousels. Verified by rasterizing real
    captured SVGs: heygen-logo.svg actually contained the Google
    wordmark, hubspot-logo.svg contained Trivago, huly-logo.svg
    contained "Kube", heygen-logo.svg → "oogo". Catalog → URL label
    inference (aria-label / nearest-heading / sectionClasses) is too
    drift-prone across partner-logo carousels; content-hash names are
    invariant by construction.

  • contentExtractor.ts: SVG→PNG rasterization via sharp before
    sending to Gemini Vision. Previous path sent raw SVG markup as
    text and hit pure-hallucination output on wordmarks (VIVIENNE
    for HubSpot, "wrestling" for Workday). Vision models can read
    PNG pixels reliably; they cannot mental-render path commands.
    Adds polarity detection (white-glyph vs dark-glyph) so an SVG
    that flattens to a blank PNG against the wrong background gets
    inverted automatically before captioning.

  • contentExtractor.ts: LOGO tag in asset-descriptions.md lines
    when the structural signals fire (independent of Gemini). The
    no-Gemini-key fallback still emits an ⚠ banner + the LOGO-tagged
    lines so agents can grep for logos via filename pattern even
    without Vision.

  • index.ts: asset-descriptions.md header branches on Gemini-key
    presence with an explicit "Vision was OFF, descriptions are
    catalog-derived" warning + a fallback recipe ("open LOGO-tagged
    SVGs in a previewer before referencing"). Progress message also
    reports catalog-fallback mode.

  • capture/assetCataloger.ts + capture/tokenExtractor.ts regex
    escape: `/^https?:\\/\\/[^/]+\\/?$/` inside the page.evaluate
    template literal. The original `/^https?:\/\/[^/]+\/?$/` was
    collapsing `\/` to `/` inside the template (because backslash
    before a non-escape char is consumed), producing a parse error
    on every capture. Capture against heygen.com and huly.io both
    100% blocked on this until the escape was fixed.

packages/core/src/lint/utils.ts

  • findRootTag masks <!-- ... -->, <style>...</style>, and
    <script>...</script> ranges before tag extraction. A literal
    <video> token inside a CSS comment (`/* The card uses <video>
    as the surface */`) inside a <style> block was being picked as
    the composition root, producing two cascading false errors
    (root_missing_composition_id + root_missing_dimensions).
    Verified against a synthetic repro plus the real beat that hit
    this. Existing stripJsComments / extractScriptTextsAndSrcs
    exports preserved — earlier work-in-progress commits had
    accidentally removed them; they're consumers of these helpers
    in lint/rules/{adapters,composition,core,gsap}.ts.

packages/cli/src/utils/lintProject.ts

  • New lintMissingLocalAsset rule: scans <video>/<img>/<source>
    src attributes for local files that don't exist in the project.
    Uses resolveExistingLocalAsset (same helper stylesheet lint
    uses) so the existence check matches the bundler's notion of
    "resolves" — handles root-absolute "/assets/foo.png" relative
    to projectDir, rejects "../outside.png" that escapes the
    project. The renderer otherwise 404s these silently and ships
    a video with missing visuals. Empirically the most common
    sub-agent mistake across multi-URL runs (~5+ per run).

packages/producer/build.mjs

  • ESM banner shims __dirname / __filename from import.meta.url
    alongside the existing createRequire shim. Bundled CJS deps —
    notably the ffmpeg/Emscripten wasm glue, which does
    `scriptDirectory = __dirname + "/"` — were crashing at render
    time with "__dirname is not defined in ES module scope".
    Verified by re-running a producer render after rebuild;
    ffmpeg pipeline completes cleanly.

All targeted tests pass (255/255 across capture / lint / lintProject
/ sfx unit tests). Typecheck clean for @hyperframes/core and
@hyperframes/cli. No happy-path behavior change; each fix targets
a specific observed failure.
@ukimsanov ukimsanov force-pushed the pr/cli-capture-w2h-fixes branch from e9d3e7c to 29af3e1 Compare June 14, 2026 21:13

@miguel-heygen miguel-heygen left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additive review — github-advanced-security[bot] commented with 6 CodeQL findings (598/604 TOCTOU, 597/603 network-to-file, 601/602 bad HTML regexp). Assessing all of them.

Strengths

  • capture-video.ts: TOCTOU addressed with { flag: "wx" } atomic exclusive create — the right fix (reading the CodeQL recommendation, not patching the symptom). The comment explaining why existsSync was removed is exactly what a reviewer needs.
  • sfx/add.ts + music/add.ts: dual-layer path traversal guard — alphanumeric ID validation AND startsWith(sfxRoot + sep) containment check. Defense-in-depth, consistent across both commands.
  • lintProject.ts maskNonScannableRanges: uses <\/script\b[^>]*> (permissive form) with an inline comment that cites CodeQL js/bad-tag-filter by name and explains why it needs this shape.
  • lintMissingLocalAsset test suite: 9 cases covering the non-obvious edges — remote URLs, data:/blob:, template placeholders __VIDEO_SRC__, sub-comp relative path rewrite, cross-composition dedup. This is thorough.
  • producer/build.mjs __dirname/__filename shim: precise ESM fix; comment explains the ffmpeg/Emscripten wasm scenario that triggers it.
  • Presigned URL refresh in sfx/add.ts (re-runs the cached query when the URL is stale) is a good robustness call — expires-in-15-min URLs would otherwise silently 403.

CodeQL assessment

Finding Verdict Reasoning
598/604 — TOCTOU False positive { flag: "wx" } IS the fix CodeQL recommends. No existsSync precheck exists.
597/603 — Network to file False positive in context CLI writing a user-requested file to a user-specified path via safeFetch (SSRF guard + 250 MB cap + content-type whitelist). Not a web server handling attacker-controlled paths.
601/602 — Bad HTML regexp False positive <\/script\b[^>]*> matches </script > and </script\t\nbar>. The scanner appears to have flagged a prior commit state; the current branch already uses the permissive form.

Nit

packages/cli/src/commands/sfx/auth.ts:41 — the c.accent() call receives a regular double-quoted string that contains ${c.dim(...)} as literal characters, not interpolated sub-expressions. Terminal output for the "then add" line will render the raw template expression text instead of colored substrings. Cosmetic only, non-blocking.

Fix:

console.log(
  `  ${c.dim("then add")} ${c.accent("HEYGEN_API_KEY=<your-key>")} ${c.dim("to a .env file, or run")} ${c.accent("hyperframes auth login")}${c.dim(".")}`,
);

Audited: capture-video.ts, sfx/add.ts, sfx/auth.ts, sfx/analyze.ts, sfx/search.ts, sfx/state.ts, sfx/inspect.ts, sfx/list.ts, sfx/catalog-manifest.ts, music/add.ts, lintProject.ts, lintProject.test.ts, core/lint/utils.ts, producer/build.mjs, cli.ts, help.ts
Trusting: _gen/client.ts + types.ts (regenerated, read integration points only), music/search.ts, music/inspect.ts, music/state.ts (mirror of sfx equivalents), capture/assetCataloger.ts + tokenExtractor.ts (read key changes only)

Verdict: APPROVE
Reasoning: No blockers. All 6 CodeQL findings are false positives in context — TOCTOU was addressed with the correct atomic write, network-to-file is guarded CLI I/O, and the HTML regexp is already the permissive form CodeQL requires. One cosmetic nit in the SFX auth help text.

— magi

@jrusso1020 jrusso1020 left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed on slack

Adds `hyperframes capture-video <project>` which downloads a single
video from the capture's video-manifest.json on demand. The capture
pipeline writes the manifest + preview PNGs but deliberately skips
the mp4s — a site with 12+ feature videos would balloon the capture
from ~5 MB to hundreds of MB. This command pulls one entry at a
time, addressed by --index N (matched against the manifest entry's
own `index` field, not array offset — the manifest can have gaps).

  • SSRF-safe via `safeFetch` from capture/assetDownloader (rejects
    private/metadata hosts, re-validates redirects).
  • Content-type whitelist: `video/*` + a small set of common
    `application/*` variants. Anything else (HTML error pages,
    JSON, tracking pixels) aborts cleanly.
  • 250 MB hard cap on Content-Length AND body size.
  • Filename sanitization: percent-decoded then anything outside
    [A-Za-z0-9._-] stripped.
  • Race-free write: `flag: "wx"` atomic exclusive-create with
    EEXIST handled as 'already downloaded' (no upstream existsSync
    precheck — eliminates the TOCTOU pattern).
  • Dual-layout aware: checks `<dir>/extracted/` (standalone
    capture) and `<dir>/capture/extracted/` (W2H project layout).
  • Suggested embed snippet includes `id="video-${entry.index}"`
    so the output passes the producer's media discovery / lint.

Registered in cli.ts + help.ts.
@ukimsanov ukimsanov force-pushed the pr/cli-capture-w2h-fixes branch from 29af3e1 to 9db1b9d Compare June 14, 2026 21:55
@ukimsanov ukimsanov changed the title feat(cli) + fix(capture/lint/producer): on-demand video fetch, music + sfx commands, pipeline robustness feat(cli): capture-video on-demand fetcher + capture/lint/producer fixes Jun 14, 2026
@ukimsanov

Copy link
Copy Markdown
Collaborator Author

Split into 3 package-scoped PRs per James's review feedback (sfx + music dropped, then per his stack pattern):

All three are independent. Closing this in favor of the smaller PRs.

@ukimsanov ukimsanov closed this Jun 14, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants