chore(ci): restructure workflow with new validation jobs#22401
chore(ci): restructure workflow with new validation jobs#22401Yuiham wants to merge 8 commits intofeature/preview-top-navigationfrom
Conversation
…devDependencies for linting and markdown checks
…ensure all referenced pages exist
…ed file names, internal links, and TOC membership
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
Summary of ChangesHello @Yuiham, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly upgrades the documentation quality assurance process by restructuring the continuous integration (CI) workflow. It introduces new validation steps to ensure the integrity of internal links and the consistency of Table of Contents (TOC) files, thereby enhancing the overall reliability and maintainability of the documentation. The changes aim to catch common documentation errors earlier in the development cycle. Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Changelog
Ignored Files
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
|
@qiancai PTAL |
There was a problem hiding this comment.
Code Review
This pull request significantly improves the CI workflow by introducing new validation jobs and making the existing scripts more robust. The new script to verify internal links within TOC files is a great addition for maintaining documentation integrity. The changes to use local npm dependencies instead of global ones in the shell scripts are excellent for ensuring reproducible builds.
I've added a few suggestions for the new verify-internal-links-in-toc.js script to improve its maintainability and performance. Overall, this is a solid contribution to improving the quality of the documentation repository.
| if ( | ||
| targetRel.startsWith("tidb-cloud/") && | ||
| !targetRel.startsWith("tidb-cloud/releases/") | ||
| ) { | ||
| const union = new Set(); | ||
| for (const toc of CLOUD_TOC_FILES) { | ||
| const set = tocToPages.get(toc); | ||
| if (!set) continue; | ||
| for (const p of set) union.add(p); | ||
| } | ||
| return { | ||
| ok: union.has(targetRel), | ||
| expectedLabel: "any TiDB Cloud TOC", | ||
| }; | ||
| } |
There was a problem hiding this comment.
The union set of cloud TOC pages is recalculated every time expectedSetForTarget is called for a tidb-cloud/ path. To improve performance, you can pre-calculate this union set once after building the TOC index.
You could modify buildTocIndex to compute and return the union set, and then pass it to expectedSetForTarget.
Example:
// In buildTocIndex
function buildTocIndex(tocFiles) {
// ... existing code to build tocToPages and anyTocPages
const cloudTocPagesUnion = new Set();
for (const toc of CLOUD_TOC_FILES) {
const set = tocToPages.get(toc);
if (!set) continue;
for (const p of set) cloudTocPagesUnion.add(p);
}
return { tocToPages, anyTocPages, cloudTocPagesUnion };
}
// In main
const { tocToPages, anyTocPages, cloudTocPagesUnion } = buildTocIndex(tocFiles);
// In the loop
const { ok, expectedLabel } = expectedSetForTarget(
targetRel,
tocToPages,
anyTocPages,
cloudTocPagesUnion // pass it here
);
// In expectedSetForTarget
function expectedSetForTarget(targetRel, tocToPages, anyTocPages, cloudTocPagesUnion) {
// ...
if (
targetRel.startsWith("tidb-cloud/") &&
!targetRel.startsWith("tidb-cloud/releases/")
) {
return {
ok: cloudTocPagesUnion.has(targetRel),
expectedLabel: "any TiDB Cloud TOC",
};
}
// ...
}| if ( | ||
| targetRel === "_index.md" || | ||
| targetRel.endsWith("/_index.md") || | ||
| targetRel === "_docHome.md" || | ||
| targetRel.endsWith("/_docHome.md") | ||
| ) { | ||
| return { ok: true }; | ||
| } |
| } | ||
|
|
||
| function main() { | ||
| process.chdir(ROOT); |
There was a problem hiding this comment.
The script consistently uses absolute paths constructed from the ROOT variable for file operations (e.g., glob.sync with cwd, path.join(ROOT, ...)). This makes changing the current working directory with process.chdir(ROOT) unnecessary. Removing this call will make the script cleaner and less reliant on global process state.
| if (missingScopePages.length > 0) { | ||
| console.error( | ||
| `TOC check error: ${missingScopePages.length} pages referenced by TOC*.md do not exist on disk.` | ||
| ); | ||
| for (const p of missingScopePages.slice(0, 50)) { | ||
| console.error(`- missing: ${p}`); | ||
| } | ||
| if (missingScopePages.length > 50) { | ||
| console.error(`- ... and ${missingScopePages.length - 50} more`); | ||
| } | ||
| } | ||
|
|
||
| if (violations.length > 0) { | ||
| console.error( | ||
| `TOC check error: ${violations.length} internal doc links point to targets not included in the expected TOC.` | ||
| ); | ||
| for (const v of violations.slice(0, 100)) { | ||
| console.error( | ||
| `- ${v.sourceRel}: ${v.url} (target: ${v.targetRel}; expected: ${v.expectedLabel})` | ||
| ); | ||
| } | ||
| if (violations.length > 100) { | ||
| console.error(`- ... and ${violations.length - 100} more`); | ||
| } | ||
| } |
There was a problem hiding this comment.
The numbers 50 and 100 are used to limit the number of reported errors. It's a good practice to extract such magic numbers into named constants at the top of the file for better readability and maintainability.
For example:
const MAX_MISSING_PAGES_TO_SHOW = 50;
const MAX_VIOLATIONS_TO_SHOW = 100;Then use these constants in the error reporting logic.
…urable verbosity and limits for missing pages and violations
…OC references in violation reports
…on script for better readability
… target extraction and sorting logic
… verification script
|
/retest |
First-time contributors' checklist
What is changed, added or deleted? (Required)
This PR restructures the CI workflow to improve documentation quality checks by:
.github/workflows/ci.yamlto add dedicated jobs for verifying duplicated file names, internal links in TOC files, and TOC membershipscripts/verify-internal-links-in-toc.jsto validate internal links in TOC files and ensure all referenced pages existpackage.jsonfor enhanced linting and markdown checksverify-link-anchors.shandverify-links.shfor better reliabilityWhich TiDB version(s) do your changes apply to? (Required)
Tips for choosing the affected version(s):
By default, CHOOSE MASTER ONLY so your changes will be applied to the next TiDB major or minor releases. If your PR involves a product feature behavior change or a compatibility change, CHOOSE THE AFFECTED RELEASE BRANCH(ES) AND MASTER.
For details, see tips for choosing the affected versions.
What is the related PR or file link(s)?
Do your changes match any of the following descriptions?