feat(metadata): new meta format #651

Open
avivkeller wants to merge 8 commits into main from new-metadata-format

@avivkeller
Member

This PR replaces the current metadata format (built from an internalMetadata, a metadata class [1], and finally a generated object) with a new format, which is simply the generated object, handled directly in the metadata generator.

The previous implementation, while a solid one that certainly served its purpose, has grown to have a number of issues, including:

  1. It's very rigid. Because the metadata is generated from a predefined internal structure, additional keys (e.g. authors or categories) are not supported, and will simply not be picked up by doc-kit. This makes frontmatter on our Learn docs impossible. The new format, on the other hand, fills an otherwise almost empty object with the YAML properties, rather than using an internal structure, meaning any extra properties will be appended to the metadata.

  2. It's not 1:1 with the input YAML. As I mentioned above, our metadata follows a predefined internal structure. That includes the YAML keys, which are renamed (e.g. added -> added_in). The new metadata uses the keys exactly as they appear in the YAML.

  3. It's large. Our metadata has several redundancies, making it large (which means more data to serialize, and ultimately, slower builds). For instance, api_doc_source and yaml_position are barely used, and slug + heading.depth both exist in multiple places. The new metadata consolidates these.

  4. Its types are global. This isn't really an issue, more of an inconvenience in the way VSCode's IntelliSense works. VSCode (or, at least, my VSCode) will only pick up the global types.d.ts if the file is open in the editor. This means that one always needs to have the metadata types open, as VSCode will not automatically import them. Moving these to imports allows IntelliSense to work more out-of-the-box.

    As an additional note on 4), the new Metadata format has more robust types, meaning more functions have more accurate types, especially for Stability and Heading nodes.

  5. It's not all in the metadata generator. The metadata generation is kinda all over the place. Part of it is in the ast generator, part in the metadata generator, and more in the utils folder. This means that other generators trying to load code from utils may load unneeded metadata generation code. For instance, createQueries() is really only needed in the metadata generator. Everything else in that file can be a constant... so now they are.
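
Points 1 and 2 can be illustrated with a minimal sketch. The buildEntry helper and the sample YAML values below are hypothetical and are not taken from doc-kit; the sketch only assumes that the YAML block has already been parsed into a plain object.

```javascript
// Hypothetical sketch: not the actual doc-kit implementation.
// Assumes the YAML block has already been parsed into a plain object.
const buildEntry = (api, yamlProperties) => ({
  // Start from an almost empty object...
  api,
  // ...and spread the YAML properties in as-is, so keys are not renamed
  // and unknown keys (e.g. `authors`) are carried along.
  ...yamlProperties,
});

const entry = buildEntry('fs', {
  added: 'v0.1.0',
  authors: ['example-author'],
});

console.log(Object.keys(entry)); // [ 'api', 'added', 'authors' ]
```

Because nothing is mapped through a fixed internal structure, a key such as authors survives without the generator needing to know about it.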

So, the new metadata format is as follows:

export interface MetadataEntry extends YAMLProperties {
  /** API identifier/name */
  api: string;
  /** Processed heading with metadata */
  heading: HeadingNode;
  /** Stability classification information */
  stability: StabilityNode;
  /** Main content as markdown AST */
  content: Root;
}

Where YAMLProperties includes everything from the YAML (e.g. changes, llm_description, etc.). The heading and stability nodes are not recreated root nodes, but rather the original nodes from the tree. This means that metadata from the original node does not need to be cloned and propagated to new nodes.
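
As a rough illustration of "original nodes, not recreated ones", the sketch below builds an entry whose heading and stability fields point at the same objects that sit in the tree. The node shapes are simplified, the sample values are made up, and whether content contains the heading itself is an assumption made for brevity.

```javascript
// Hypothetical, simplified mdast-like nodes; values are illustrative only.
const headingNode = {
  type: 'heading',
  depth: 2,
  children: [{ type: 'text', value: 'fs.readFile()' }],
  data: { text: 'fs.readFile()' },
};
const stabilityNode = {
  type: 'blockquote',
  children: [{ type: 'paragraph', children: [] }],
  data: { index: '2', description: 'Stable' },
};
const tree = { type: 'root', children: [headingNode, stabilityNode] };

const entry = {
  api: 'fs',
  added: 'v0.1.29', // YAML key, kept under its original name
  heading: headingNode, // the same object as tree.children[0], not a copy
  stability: stabilityNode, // likewise the original blockquote node
  content: tree,
};

console.log(entry.heading === tree.children[0]); // true
console.log(entry.stability === tree.children[1]); // true
```

Since the entry holds references, anything attached to a node's data is visible through both the tree and the entry without any copying step.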

Footnotes

  1. Well, a function that acts like a class

@avivkeller avivkeller requested a review from a team as a code owner March 6, 2026 21:19
Copilot AI review requested due to automatic review settings March 6, 2026 21:19
@vercel

vercel bot commented Mar 6, 2026

Project | Deployment | Actions | Updated (UTC)
api-docs-tooling | Ready | Preview | Mar 6, 2026 9:56pm

@codecov

codecov bot commented Mar 6, 2026

Codecov Report

❌ Patch coverage is 80.62756% with 142 lines in your changes missing coverage. Please review.
✅ Project coverage is 74.76%. Comparing base (acb4c34) to head (543e858).
⚠️ Report is 3 commits behind head on main.

Files with missing lines | Patch % | Lines
...nerators/metadata/utils/__tests__/transformers.mjs | 0.00% | 29 Missing ⚠️
src/generators/legacy-html/utils/buildContent.mjs | 0.00% | 27 Missing ⚠️
src/generators/legacy-json/utils/buildSection.mjs | 0.00% | 23 Missing ⚠️
src/generators/ast/generate.mjs | 0.00% | 10 Missing ⚠️
...c/generators/legacy-html/utils/tableOfContents.mjs | 0.00% | 8 Missing ⚠️
src/generators/metadata/utils/visitors.mjs | 94.77% | 8 Missing ⚠️
...generators/legacy-html/utils/buildExtraContent.mjs | 0.00% | 7 Missing ⚠️
...rc/generators/jsx-ast/utils/getSortedHeadNodes.mjs | 0.00% | 4 Missing ⚠️
src/generators/man-page/generate.mjs | 0.00% | 4 Missing ⚠️
src/generators/metadata/utils/parse.mjs | 92.85% | 4 Missing ⚠️
... and 13 more
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #651      +/-   ##
==========================================
- Coverage   75.90%   74.76%   -1.15%     
==========================================
  Files         145      146       +1     
  Lines       13735    13268     -467     
  Branches      992      961      -31     
==========================================
- Hits        10426     9920     -506     
- Misses       3303     3342      +39     
  Partials        6        6              

Contributor

Copilot AI left a comment

Pull request overview

This PR refactors the metadata pipeline to use a single generated metadata object (1:1 with YAML/frontmatter) and moves metadata-specific parsing/transforms into the metadata generator, eliminating the previous internalMetadata/class-like builder and the createQueries() manager.

Changes:

  • Replace the old createMetadata/ApiDocMetadataEntry format with the new MetadataEntry shape and update generators/types accordingly.
  • Split query helpers into exported QUERIES/UNIST constants and relocate metadata parsing logic into dedicated metadata/utils/* modules (yaml/visitors/transformers).
  • Update generators and tests to consume the new metadata keys (e.g. added, deprecated, removed) and new node shapes.
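
To make the second bullet concrete, here is a minimal sketch of module-level constants that any generator could use directly, with no createQueries() call needed first. The property names and the regular expression are hypothetical stand-ins; the real QUERIES and UNIST exports may look different.

```javascript
// Hypothetical names and patterns, for illustration only.
const QUERIES = Object.freeze({
  // Matches text such as "Stability: 2 - Stable" and captures the index.
  stabilityIndexPrefix: /^Stability: ?(\d)/,
});

const UNIST = Object.freeze({
  // Plain predicates over unist-style nodes, usable as visitor tests.
  isHeading: node => node.type === 'heading',
  isBlockquote: node => node.type === 'blockquote',
});

// Consumers read the constants directly instead of instantiating a manager.
const match = 'Stability: 2 - Stable'.match(QUERIES.stabilityIndexPrefix);
console.log(match[1]); // 2
console.log(UNIST.isHeading({ type: 'heading', depth: 2 })); // true
```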

Reviewed changes

Copilot reviewed 45 out of 59 changed files in this pull request and generated 9 comments.

Show a summary per file
File Description
src/utils/queries/utils.mjs Switch typed-list detection to use exported QUERIES constant.
src/utils/queries/index.mjs Remove createQueries(); export QUERIES and UNIST constants for reuse across generators.
src/utils/queries/constants.mjs Remove stability permalink constant (moved to AST generator constants).
src/utils/queries/tests/index.test.mjs Update tests to target exported UNIST helpers.
src/utils/parser/constants.mjs Delete parser constants; moved into metadata generator constants.
src/utils/generators.mjs Update types and adjust API doc URL building for new metadata shape.
src/utils/configuration/types.d.ts Update changelog entry type to the new ReleaseEntry.
src/types.d.ts Remove global types in favor of imported/local types.
src/parsers/types.d.ts Add ReleaseEntry type for changelog parsing/config typing.
src/parsers/markdown.mjs Update JSDoc return type to ReleaseEntry[].
src/metadata.mjs Remove legacy metadata builder implementation.
src/generators/sitemap/utils/createPageSitemapEntry.mjs Retype input to MetadataEntry.
src/generators/sitemap/types.d.ts Update generator type signature to MetadataEntry[].
src/generators/orama-db/utils/title.mjs Retype helper params to MetadataEntry[].
src/generators/orama-db/types.d.ts Update generator type signature to MetadataEntry[].
src/generators/orama-db/generate.mjs Update Orama document href generation to new metadata structure.
src/generators/metadata/utils/yaml.mjs New module: YAML extraction/normalization/parsing for metadata generator.
src/generators/metadata/utils/visitors.mjs New module: AST visitors for links, type refs, stability, YAML merging.
src/generators/metadata/utils/transformers.mjs Refactor heading/type transformations; move constants import to metadata generator constants.
src/generators/metadata/utils/slugger.mjs Update slugger to import replacements from metadata generator constants.
src/generators/metadata/utils/parse.mjs Rewrite metadata parsing to directly build MetadataEntry objects per heading section.
src/generators/metadata/utils/__tests__/yaml.test.mjs Retarget YAML tests to new yaml utility module.
src/generators/metadata/utils/__tests__/transformers.mjs New tests for transformTypeToReferenceLink.
src/generators/metadata/utils/__tests__/parse.test.mjs Update metadata parsing tests to new field names/node shapes.
src/generators/metadata/types.d.ts Define the new MetadataEntry/YAMLProperties/node types.
src/generators/metadata/constants.mjs Consolidate previously parser-level constants into metadata generator.
src/generators/man-page/utils/converter.mjs Retype converter inputs to MetadataEntry.
src/generators/man-page/types.d.ts Update generator type signature to MetadataEntry[].
src/generators/man-page/generate.mjs Update section slicing logic to new heading/stability shapes.
src/generators/llms-txt/utils/buildApiDocLink.mjs Retype inputs to MetadataEntry.
src/generators/llms-txt/utils/tests/buildApiDocLink.test.mjs Update tests for new link-building behavior/inputs.
src/generators/llms-txt/types.d.ts Update generator type signature to MetadataEntry[].
src/generators/legacy-json/utils/parseList.mjs Switch typed-list parsing to QUERIES/UNIST exports.
src/generators/legacy-json/utils/buildSection.mjs Update legacy JSON section building for new metadata keys/node shapes.
src/generators/legacy-json/utils/buildHierarchy.mjs Retype hierarchy helpers to MetadataEntry.
src/generators/legacy-json/types.d.ts Update generator/entry types to build on MetadataEntry.
src/generators/legacy-html/utils/tableOfContents.mjs Update depth handling to heading.depth and retype inputs.
src/generators/legacy-html/utils/buildExtraContent.mjs Update stability filtering to new stability presence semantics.
src/generators/legacy-html/utils/buildDropdowns.mjs Retype versions list to ReleaseEntry[].
src/generators/legacy-html/utils/buildContent.mjs Update content building to new metadata keys and QUERIES/UNIST.
src/generators/legacy-html/types.d.ts Update generator type signature to MetadataEntry[].
src/generators/legacy-html-all/types.d.ts Update all-in-one generator input typing to match template values.
src/generators/legacy-html-all/generate.mjs Adjust aggregation inputs for new legacy-html output typing.
src/generators/jsx-ast/utils/types.mjs Switch to using QUERIES/UNIST exports.
src/generators/jsx-ast/utils/signature.mjs Update typed-list detection and type annotations for new heading types.
src/generators/jsx-ast/utils/getSortedHeadNodes.mjs Retype helpers and keep head-node selection via heading.depth.
src/generators/jsx-ast/utils/buildContent.mjs Retype entry params and switch typed-list/stability visitors to UNIST.
src/generators/jsx-ast/utils/buildBarProps.mjs Retype and adjust ToC/meta bar props generation for new metadata keys.
src/generators/jsx-ast/utils/tests/buildBarProps.test.mjs Update tests to new lifecycle key names (added).
src/generators/jsx-ast/types.d.ts Update generator type signature to MetadataEntry[].
src/generators/jsx-ast/generate.mjs Update generator orchestration types and grouping for new metadata format.
src/generators/jsx-ast/constants.mjs Rename lifecycle label keys to new YAML-aligned names.
src/generators/json-simple/types.d.ts Update generator type signature to MetadataEntry[].
src/generators/json-simple/generate.mjs Switch stability/heading removal logic to UNIST exports.
src/generators/ast/generate.mjs Inline stability-index link rewrite using exported regex and new constant.
src/generators/ast/constants.mjs New constant for stability index URL.
src/generators/addon-verify/types.d.ts Update generator type signature to MetadataEntry[].
src/tests/metadata.test.mjs Remove tests for deleted legacy metadata builder.
docs/generators.md Update documentation types to reference MetadataEntry.
Comments suppressed due to low confidence (3)

src/generators/jsx-ast/utils/buildBarProps.mjs:41

  • shouldIncludeEntryInToC is checking heading.data.depth, but depth is now read from the heading node itself (heading.depth). With the current code, heading.data.depth is usually undefined, causing the depth filter to fail and the ToC to come out empty. Use heading.depth here (and keep it consistent with the rest of the new metadata shape).
const shouldIncludeEntryInToC = ({ heading }) =>
  // Only include headings with text,
  heading?.data?.text.length &&
  // and whose depth <= the maximum allowed.
  heading?.data?.depth <= TOC_MAX_HEADING_DEPTH;
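
A possible fix, sketched under the assumption that depth lives on the heading node and text stays on heading.data, as described above. The value of TOC_MAX_HEADING_DEPTH and the sample entries are placeholders chosen for the example.

```javascript
// Assumed value for illustration; the real constant is defined in the generator.
const TOC_MAX_HEADING_DEPTH = 3;

const shouldIncludeEntryInToC = ({ heading }) =>
  // Only include headings with text,
  Boolean(heading?.data?.text?.length) &&
  // and whose depth (read from the node itself) <= the maximum allowed.
  heading.depth <= TOC_MAX_HEADING_DEPTH;

// Hypothetical entries using the new node shape.
const entries = [
  { heading: { type: 'heading', depth: 2, data: { text: 'fs.readFile()' } } },
  { heading: { type: 'heading', depth: 5, data: { text: 'Too deep' } } },
  { heading: { type: 'heading', depth: 1, data: { text: '' } } },
];

const toc = entries
  .filter(shouldIncludeEntryInToC)
  .map(entry => entry.heading.data.text);

console.log(toc); // [ 'fs.readFile()' ]
```

With the original check, heading.data.depth is undefined, and undefined <= 3 evaluates to false, so every entry would be filtered out.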

src/generators/legacy-json/utils/buildSection.mjs:114

  • parseStability() is still treating stability like the old { type:'root', children:[blockquote] } wrapper. stability.children[0] is a paragraph inside the blockquote, so indexOf() returns -1 and the subsequent splice can remove the wrong node. Update this logic to locate/remove the blockquote node itself (and also check stabilityIdx !== -1 rather than relying on truthiness).
  const parseStability = (section, nodes, { stability, content }) => {
    if (stability) {
      section.stability = Number(stability.data.index);
      section.stabilityText = stability.data.description;

      const stabilityIdx = content.children.indexOf(stability.children[0]);

      if (stabilityIdx) {
        nodes.splice(stabilityIdx - 1, 1);
      }
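
The two problems named above (searching for the wrong node, and testing the index for truthiness) can be reproduced in isolation. This sketch uses simplified, hypothetical node shapes and works on content.children directly, because the excerpt does not show how nodes relates to content.children (the - 1 offset in the original).

```javascript
// Simplified, hypothetical node shapes; only the nesting matters here.
const paragraph = { type: 'paragraph', children: [] };
const stability = {
  type: 'blockquote',
  children: [paragraph],
  data: { index: '2', description: 'Stable' },
};
const content = {
  type: 'root',
  children: [stability, { type: 'paragraph', children: [] }],
};

// Old lookup: the paragraph is nested inside the blockquote, so it is
// never a direct child of `content`.
const staleIdx = content.children.indexOf(stability.children[0]);
console.log(staleIdx, Boolean(staleIdx)); // -1 true

// New lookup: search for the blockquote node itself.
const stabilityIdx = content.children.indexOf(stability);
console.log(stabilityIdx, Boolean(stabilityIdx)); // 0 false

// Compare against -1 explicitly so index 0 is handled and "not found" is not.
const removeStability = (children, node) => {
  const idx = children.indexOf(node);
  return idx === -1 ? children : children.filter((_, i) => i !== idx);
};

const remaining = removeStability(content.children, stability);
console.log(remaining.length); // 1
```

The truthiness check fails in both directions: -1 (not found) passes it, while 0 (found at the first position) does not.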

src/generators/legacy-html/utils/buildExtraContent.mjs:19

  • buildStabilityOverview is still reading stability data from stability.children[0].data, but stability is now stored as the blockquote node with its metadata on stability.data (see visitStability). As written, data.index/data.description will be undefined and the stability table will break. Use stability.data instead of destructuring from children.
  const headNodesWithStability = headMetadata.filter(entry => entry.stability);

  const mappedHeadNodesIntoTable = headNodesWithStability.map(
    ({ heading, api, stability }) => {
      // Retrieves the first Stability Index, as we only want to use the first one
      // to generate the Stability Overview
      const [{ data }] = stability.children;
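
A sketch of the suggested change, reading index and description from stability.data. The sample entries and the shape of the returned rows are hypothetical; only the stability.data access reflects the comment above.

```javascript
// Hypothetical entries in the new shape: `stability` is the blockquote node,
// with its metadata stored on `stability.data`.
const headMetadata = [
  {
    api: 'fs',
    heading: { type: 'heading', depth: 1, data: { text: 'File system' } },
    stability: {
      type: 'blockquote',
      children: [{ type: 'paragraph', children: [] }],
      data: { index: '2', description: 'Stable' },
    },
  },
  {
    api: 'misc',
    heading: { type: 'heading', depth: 1, data: { text: 'Misc' } },
    stability: undefined,
  },
];

const rows = headMetadata
  .filter(entry => entry.stability)
  .map(({ api, heading, stability }) => {
    // Read the metadata from the node itself, not from its first child.
    const { index, description } = stability.data;
    return { api, title: heading.data.text, index, description };
  });

console.log(rows.length); // 1
console.log(rows[0].index, rows[0].description); // 2 Stable

// The old access path finds no data on the nested paragraph.
console.log(headMetadata[0].stability.children[0].data); // undefined
```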


@github-actions

github-actions bot commented Mar 6, 2026

orama-db Generator

File | Base | Head | Diff
orama-db.json | 8.05 MB | 8.04 MB | -1.78 KB (-0.02%)

web Generator

File | Base | Head | Diff
diagnostics_channel.js | 228.45 KB | 228.46 KB | +13.00 B (+0.01%)
single-executable-applications.js | 86.36 KB | 86.37 KB | +13.00 B (+0.01%)
test.js | 862.58 KB | 862.59 KB | +13.00 B (+0.00%)
diagnostics_channel.html | 234.19 KB | 234.20 KB | +8.00 B (+0.00%)
single-executable-applications.html | 105.61 KB | 105.62 KB | +8.00 B (+0.01%)
test.html | 735.66 KB | 735.67 KB | +8.00 B (+0.00%)
module.html | 325.19 KB | 325.19 KB | -2.00 B (-0.00%)
module.js | 337.48 KB | 337.48 KB | -2.00 B (-0.00%)

@avivkeller
Member Author

As shown by the actions comment above, this new meta is 1:1 in output with the previous one :-)
