chore(ci): restructure workflow with new validation jobs #22401

Open
Yuiham wants to merge 8 commits into feature/preview-top-navigation from chore/nextgen-ci

@Yuiham (Collaborator) commented Feb 4, 2026

First-time contributors' checklist

What is changed, added or deleted? (Required)

This PR restructures the CI workflow to improve documentation quality checks. It includes the following changes:

  • Restructured CI workflow: Reorganized the .github/workflows/ci.yaml to add dedicated jobs for verifying duplicated file names, internal links in TOC files, and TOC membership
  • Added new verification script: Created scripts/verify-internal-links-in-toc.js to validate internal links in TOC files and ensure all referenced pages exist
  • Updated dependencies: Added new devDependencies in package.json for enhanced linting and markdown checks
  • Improved link verification scripts: Modified verify-link-anchors.sh and verify-links.sh to prefer repo-local installs of remark and markdown-link-check and to exclude additional paths, for better reliability
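
The "ensure all referenced pages exist" check can be pictured with a minimal sketch. This is not the code in scripts/verify-internal-links-in-toc.js: the helper names `extractTocPages` and `findMissingPages`, the link regex, and the sample TOC are all assumptions made for illustration, and the file-existence test is injected so the example runs without touching disk.

```javascript
// Hypothetical helper: collect relative .md link targets from TOC text.
function extractTocPages(tocText) {
  const pages = new Set();
  // Matches Markdown link targets such as (/overview.md) or (page.md#anchor).
  const linkPattern = /\]\(([^)\s]+)\)/g;
  for (const match of tocText.matchAll(linkPattern)) {
    const url = match[1];
    // Skip anything with a URL scheme (https:, mailto:, ...).
    if (/^[a-z][a-z0-9+.-]*:/i.test(url)) continue;
    const withoutAnchor = url.split("#")[0];
    if (!withoutAnchor.endsWith(".md")) continue;
    // Normalize "/overview.md" to the repo-relative "overview.md".
    pages.add(withoutAnchor.replace(/^\/+/, ""));
  }
  return pages;
}

// Hypothetical helper: return TOC pages for which pageExists(page) is false.
function findMissingPages(tocText, pageExists) {
  return [...extractTocPages(tocText)].filter((page) => !pageExists(page));
}

// Sample data (assumed, not taken from the repository).
const toc = [
  "- [Overview](/overview.md)",
  "- [Quick Start](/quick-start.md#install)",
  "- [Website](https://example.com/docs.md)",
].join("\n");
const onDisk = new Set(["overview.md"]);

const missing = findMissingPages(toc, (page) => onDisk.has(page));
console.log(missing);
```

In a real script the predicate would likely wrap a file system lookup. Keeping it as a parameter makes the check easy to test.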

Which TiDB version(s) do your changes apply to? (Required)

Tips for choosing the affected version(s):

By default, CHOOSE MASTER ONLY so your changes will be applied to the next TiDB major or minor releases. If your PR involves a product feature behavior change or a compatibility change, CHOOSE THE AFFECTED RELEASE BRANCH(ES) AND MASTER.

For details, see tips for choosing the affected versions.

  • master (the latest development version)
  • v8.5 (TiDB 8.5 versions)
  • v8.4 (TiDB 8.4 versions)
  • v8.3 (TiDB 8.3 versions)
  • v8.2 (TiDB 8.2 versions)
  • v8.1 (TiDB 8.1 versions)
  • v7.5 (TiDB 7.5 versions)
  • v7.1 (TiDB 7.1 versions)
  • v6.5 (TiDB 6.5 versions)
  • v6.1 (TiDB 6.1 versions)
  • v5.4 (TiDB 5.4 versions)
  • v5.3 (TiDB 5.3 versions)

What is the related PR or file link(s)?

  • This PR is translated from: N/A
  • Other reference link(s): N/A

Do your changes match any of the following descriptions?

  • Delete files
  • Change aliases
  • Need modification after applied to another branch
  • Might cause conflicts after applied to another branch

@ti-chi-bot bot commented Feb 4, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign oreoxmt for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot bot added labels Feb 4, 2026: missing-translation-status (This PR does not have translation status info.), size/XXL (Denotes a PR that changes 1000+ lines, ignoring generated files.)
@gemini-code-assist bot (Contributor) commented

Summary of Changes

Hello @Yuiham, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly upgrades the documentation quality assurance process by restructuring the continuous integration (CI) workflow. It introduces new validation steps to ensure the integrity of internal links and the consistency of Table of Contents (TOC) files, thereby enhancing the overall reliability and maintainability of the documentation. The changes aim to catch common documentation errors earlier in the development cycle.

Highlights

  • CI Workflow Restructuring: The CI workflow has been reorganized to include dedicated jobs for enhanced documentation quality checks, specifically for verifying duplicated file names, internal links in TOC files, and TOC membership.
  • New Internal Link Validation Script: A new JavaScript script, scripts/verify-internal-links-in-toc.js, has been introduced. This script validates internal links within Table of Contents (TOC) files, ensuring that all referenced pages exist and are correctly included in their respective TOCs.
  • Dependency Updates for Linting and Link Checking: New development dependencies have been added to package.json, including @breeswish-org/remark-lint-pingcap-docs-anchor, markdown-link-check, remark-cli, and remark-lint, to support the enhanced linting and markdown checks.
  • Improved Link Verification Scripts: The existing scripts/verify-link-anchors.sh and scripts/verify-links.sh have been modified to improve reliability. They now prioritize using locally installed versions of remark and markdown-link-check for better reproducibility and include more robust path exclusion logic.
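
The "prefer a locally installed version" behavior can be sketched as follows. The actual scripts are shell scripts and their exact logic is not shown in this conversation, so this JavaScript version only illustrates the idea: `resolveCommand` is a hypothetical helper, `/repo` is a placeholder path, and the existence check is injected rather than read from disk. The only fact relied on is that npm places binaries of installed packages under node_modules/.bin.

```javascript
// Hypothetical helper: use the repo-local binary when it exists,
// otherwise fall back to the bare command name (resolved via PATH).
function resolveCommand(name, repoRoot, fileExists) {
  const localBin = `${repoRoot}/node_modules/.bin/${name}`;
  return fileExists(localBin) ? localBin : name;
}

// Assumed state for the example: only remark is installed locally.
const installed = new Set(["/repo/node_modules/.bin/remark"]);
const fileExists = (p) => installed.has(p);

const remarkCmd = resolveCommand("remark", "/repo", fileExists);
const linkCheckCmd = resolveCommand("markdown-link-check", "/repo", fileExists);
console.log(remarkCmd, linkCheckCmd);
```

Resolving the command once, up front, keeps the tool version tied to package.json instead of whatever happens to be installed globally on the CI runner.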

Changelog
  • package.json
    • Added new devDependencies for markdown linting and link checking tools: @breeswish-org/remark-lint-pingcap-docs-anchor, markdown-link-check, remark-cli, and remark-lint.
  • scripts/verify-internal-links-in-toc.js
    • New file added to implement comprehensive verification of internal links within TOC files.
    • Checks if pages referenced in TOCs actually exist on disk.
    • Validates that internal document links point to targets that are correctly included in the expected TOCs, with special handling for cloud-specific and prefix-based TOCs.
  • scripts/verify-link-anchors.sh
    • Modified to dynamically determine the remark command, preferring a repo-local installation for improved reproducibility.
    • Added --ignore-pattern '.*/**' to the remark command for more precise filtering of ignored paths.
    • Removed global npm install commands for linting tools, relying on devDependencies.
  • scripts/verify-links.sh
    • Modified to dynamically determine the markdown-link-check command, prioritizing a repo-local installation for better reproducibility.
    • Updated the find command to exclude additional paths (./tmp/*, ./.*/*) when searching for markdown files, enhancing the accuracy of link checking.
    • Removed global npm install for markdown-link-check, relying on devDependencies.
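
The additional exclusions mentioned for verify-links.sh (./tmp/* and ./.*/*) can be approximated in JavaScript. This is a sketch of the filtering idea only, not a port of the script's find command: `isExcluded` is a hypothetical helper, the file list is made up, and any exclusions the script already had are not modeled.

```javascript
// Hypothetical helper approximating two exclusion patterns:
//   ./tmp/*  -> anything under the top-level tmp directory
//   ./.*/*   -> anything under a top-level dot-directory (for example .github)
function isExcluded(relPath) {
  const normalized = relPath.replace(/^\.\//, "");
  const hasSubpath = normalized.includes("/");
  const firstSegment = normalized.split("/")[0];
  return hasSubpath && (firstSegment === "tmp" || firstSegment.startsWith("."));
}

// Sample paths (assumed for illustration).
const files = [
  "./overview.md",
  "./tmp/draft.md",
  "./.github/pull_request_template.md",
  "./tidb-cloud/intro.md",
];

const checked = files.filter((f) => !isExcluded(f));
console.log(checked);
```
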
Ignored Files
  • Ignored by pattern: .github/workflows/** (1)
    • .github/workflows/ci.yaml
Activity
  • The pull request was created by Yuiham with the title 'chore(ci): restructure workflow with new validation jobs'.
  • The author provided a detailed description outlining the restructuring of the CI workflow, the addition of new verification scripts, dependency updates, and improvements to existing link verification scripts.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page. Here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@Yuiham (Collaborator, Author) commented Feb 4, 2026

@qiancai PTAL

@gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request significantly improves the CI workflow by introducing new validation jobs and making the existing scripts more robust. The new script to verify internal links within TOC files is a great addition for maintaining documentation integrity. The changes to use local npm dependencies instead of global ones in the shell scripts are excellent for ensuring reproducible builds.

I've added a few suggestions for the new verify-internal-links-in-toc.js script to improve its maintainability and performance. Overall, this is a solid contribution to improving the quality of the documentation repository.

Comment on lines 107 to 121
  if (
    targetRel.startsWith("tidb-cloud/") &&
    !targetRel.startsWith("tidb-cloud/releases/")
  ) {
    const union = new Set();
    for (const toc of CLOUD_TOC_FILES) {
      const set = tocToPages.get(toc);
      if (!set) continue;
      for (const p of set) union.add(p);
    }
    return {
      ok: union.has(targetRel),
      expectedLabel: "any TiDB Cloud TOC",
    };
  }

medium

The union set of cloud TOC pages is recalculated every time expectedSetForTarget is called for a tidb-cloud/ path. To improve performance, you can pre-calculate this union set once after building the TOC index.

You could modify buildTocIndex to compute and return the union set, and then pass it to expectedSetForTarget.

Example:

// In buildTocIndex
function buildTocIndex(tocFiles) {
  // ... existing code to build tocToPages and anyTocPages
  
  const cloudTocPagesUnion = new Set();
  for (const toc of CLOUD_TOC_FILES) {
    const set = tocToPages.get(toc);
    if (!set) continue;
    for (const p of set) cloudTocPagesUnion.add(p);
  }

  return { tocToPages, anyTocPages, cloudTocPagesUnion };
}

// In main
const { tocToPages, anyTocPages, cloudTocPagesUnion } = buildTocIndex(tocFiles);

// In the loop
const { ok, expectedLabel } = expectedSetForTarget(
  targetRel,
  tocToPages,
  anyTocPages,
  cloudTocPagesUnion // pass it here
);

// In expectedSetForTarget
function expectedSetForTarget(targetRel, tocToPages, anyTocPages, cloudTocPagesUnion) {
  // ...
  if (
    targetRel.startsWith("tidb-cloud/") &&
    !targetRel.startsWith("tidb-cloud/releases/")
  ) {
    return {
      ok: cloudTocPagesUnion.has(targetRel),
      expectedLabel: "any TiDB Cloud TOC",
    };
  }
  // ...
}
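
A possible alternative to threading the union through function signatures is to build it lazily and cache it. The sketch below is only an illustration of that option: `makeCloudUnionGetter` is a hypothetical name, the `CLOUD_TOC_FILES` values are placeholders, and the sample `tocToPages` map is invented. The build counter exists only to show that the union is computed once.

```javascript
// Placeholder values; the real constant lives in the script under review.
const CLOUD_TOC_FILES = ["TOC-cloud-a.md", "TOC-cloud-b.md"];

// Hypothetical factory: returns a getter that builds the union of all
// cloud TOC pages on first use and reuses the cached Set afterwards.
function makeCloudUnionGetter(tocToPages) {
  let cached = null;
  let builds = 0;
  const get = () => {
    if (cached === null) {
      builds += 1;
      cached = new Set();
      for (const toc of CLOUD_TOC_FILES) {
        // A TOC file missing from the index contributes no pages.
        for (const page of tocToPages.get(toc) ?? []) cached.add(page);
      }
    }
    return cached;
  };
  get.buildCount = () => builds;
  return get;
}

// Sample index (assumed for illustration).
const tocToPages = new Map([
  ["TOC-cloud-a.md", new Set(["tidb-cloud/a.md"])],
  ["TOC-cloud-b.md", new Set(["tidb-cloud/b.md"])],
]);

const getCloudUnion = makeCloudUnionGetter(tocToPages);
const hits = ["tidb-cloud/a.md", "tidb-cloud/c.md"].map((t) => getCloudUnion().has(t));
console.log(hits, getCloudUnion.buildCount());
```

The trade-off is that the cache hides state inside a closure, whereas the version suggested above makes the dependency explicit in each call.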

Comment on lines +98 to +105
  if (
    targetRel === "_index.md" ||
    targetRel.endsWith("/_index.md") ||
    targetRel === "_docHome.md" ||
    targetRel.endsWith("/_docHome.md")
  ) {
    return { ok: true };
  }

low

This block of code is unreachable. The check for special targets like _index.md and _docHome.md is already performed in the main function at line 162 before this function is called. You can safely remove this redundant check to improve code clarity.
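
One way to keep the special-target rule in a single place, so it cannot drift between the caller and the callee, is a small predicate that only the caller uses. This is a sketch under assumptions: `isSpecialTarget` and `SPECIAL_BASENAMES` are hypothetical names, and only the two file names come from the snippet above.

```javascript
// File names taken from the reviewed snippet; the constant name is hypothetical.
const SPECIAL_BASENAMES = ["_index.md", "_docHome.md"];

// Hypothetical helper: true when the target is one of the special pages,
// either at the repository root or inside any directory.
function isSpecialTarget(targetRel) {
  return SPECIAL_BASENAMES.some(
    (name) => targetRel === name || targetRel.endsWith(`/${name}`)
  );
}

// Sample targets (assumed for illustration).
const targets = ["_index.md", "tidb-cloud/_docHome.md", "my_index.md", "overview.md"];

// Only targets that are not special would go on to TOC membership validation.
const toValidate = targets.filter((t) => !isSpecialTarget(t));
console.log(toValidate);
```
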

}

function main() {
  process.chdir(ROOT);

low

The script consistently uses absolute paths constructed from the ROOT variable for file operations (e.g., glob.sync with cwd, path.join(ROOT, ...)). This makes changing the current working directory with process.chdir(ROOT) unnecessary. Removing this call will make the script cleaner and less reliant on global process state.
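
The point about absolute paths can be checked with a short sketch: a path built with path.join from a fixed root does not depend on the current working directory. The `ROOT` value and the `toAbsolute` helper below are placeholders for illustration, not the script's own definitions.

```javascript
const os = require("os");
const path = require("path");

// Placeholder root; the real script derives its own ROOT value.
const ROOT = path.resolve("/repo/docs");

// Hypothetical helper: build an absolute path from a repo-relative one.
function toAbsolute(rel) {
  return path.join(ROOT, rel);
}

const before = toAbsolute("tidb-cloud/intro.md");

// Change the working directory, then build the same path again.
const originalCwd = process.cwd();
process.chdir(os.tmpdir());
const after = toAbsolute("tidb-cloud/intro.md");
process.chdir(originalCwd);

console.log(before === after, path.isAbsolute(after));
```

This only holds if every file operation goes through such an absolute path. A single call that passes a bare relative path would still depend on the working directory.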

Comment on lines 177 to 201
  if (missingScopePages.length > 0) {
    console.error(
      `TOC check error: ${missingScopePages.length} pages referenced by TOC*.md do not exist on disk.`
    );
    for (const p of missingScopePages.slice(0, 50)) {
      console.error(`- missing: ${p}`);
    }
    if (missingScopePages.length > 50) {
      console.error(`- ... and ${missingScopePages.length - 50} more`);
    }
  }

  if (violations.length > 0) {
    console.error(
      `TOC check error: ${violations.length} internal doc links point to targets not included in the expected TOC.`
    );
    for (const v of violations.slice(0, 100)) {
      console.error(
        `- ${v.sourceRel}: ${v.url} (target: ${v.targetRel}; expected: ${v.expectedLabel})`
      );
    }
    if (violations.length > 100) {
      console.error(`- ... and ${violations.length - 100} more`);
    }
  }

low

The numbers 50 and 100 are used to limit the number of reported errors. It's a good practice to extract such magic numbers into named constants at the top of the file for better readability and maintainability.

For example:

const MAX_MISSING_PAGES_TO_SHOW = 50;
const MAX_VIOLATIONS_TO_SHOW = 100;

Then use these constants in the error reporting logic.
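
As a sketch of how such a constant might be used, the truncation logic that appears twice in the reviewed code could be shared by one helper. `formatLimited` is a hypothetical name and the sample data is generated for the example. The message formats are copied from the snippet above.

```javascript
const MAX_MISSING_PAGES_TO_SHOW = 50;

// Hypothetical helper: format at most `max` items, then add one summary
// line for whatever was left out.
function formatLimited(items, max, formatItem) {
  const lines = items.slice(0, max).map(formatItem);
  if (items.length > max) {
    lines.push(`- ... and ${items.length - max} more`);
  }
  return lines;
}

// Generated sample data: 52 missing pages.
const missingScopePages = Array.from({ length: 52 }, (_, i) => `page-${i}.md`);

const lines = formatLimited(
  missingScopePages,
  MAX_MISSING_PAGES_TO_SHOW,
  (p) => `- missing: ${p}`
);
console.log(lines.length, lines[lines.length - 1]);
```

The same helper could serve the violations report with its own limit and formatter.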

@qiancai (Collaborator) commented Feb 9, 2026

/retest
