[GSoC 2026] Add automated translation stub generation for missing reference pages#1473

Draft
aashishpanthi wants to merge 16 commits into processing:v1 from aashishpanthi:feature/translation-tracker/stub-file-generation

@aashishpanthi

Member

Addresses #1404

Changes

This PR implements stub-file generation for the GSoC 2026 Translation Tracker (#1404). When a new English reference page exists without translations, the tracker generates placeholder MDX stubs and opens one PR per language for maintainer review.

Features

Stub generation

  • Detects missing translations for reference content (findMissingTranslations)
  • Generates stub MDX files from English sources:
    • Copies essential frontmatter (title, module, submodule, file, description)
    • Sets needsTranslation: true
    • Adds a placeholder body and HTML comment linking to the English source
  • Opens one PR per language via the GitHub API (blob → tree → commit → branch → pull request)
  • Default languages: es, hi, ko, zh-Hans
  • STUB_MAX_FILES applies per language (default 50), not as a global cap across all languages
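
As a rough illustration of the stub format described above, the following sketch builds stub text from already-parsed English frontmatter. The helper name `buildStub`, the placeholder sentence, and the exact comment wording are assumptions for illustration only; the PR's own `generateStubFromEnglish` may differ.

```javascript
// Frontmatter keys copied into a stub (taken from the feature list above).
const STUB_FRONTMATTER_KEYS = ['title', 'module', 'submodule', 'file', 'description'];

// Hypothetical helper: builds stub MDX text from parsed English frontmatter.
// `englishRelPath` is the repository-relative path of the English source.
function buildStub(frontmatter, englishRelPath) {
  const lines = ['---'];
  for (const key of STUB_FRONTMATTER_KEYS) {
    if (frontmatter[key] !== undefined) {
      // JSON.stringify produces a double-quoted string, which also reads as a YAML scalar.
      lines.push(`${key}: ${JSON.stringify(String(frontmatter[key]))}`);
    }
  }
  // Flag the page so the site and translators can tell it is a placeholder.
  lines.push('needsTranslation: true', '---', '');
  // HTML comment pointing back to the English source (wording is an assumption).
  lines.push(`<!-- English source: ${englishRelPath} -->`, '');
  // Placeholder body (wording is an assumption).
  lines.push('This page has not been translated yet.', '');
  return lines.join('\n');
}
```

Keys that are absent from the English frontmatter are simply omitted, and any key outside the allowed list is not copied.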

GitHub Actions

  • New workflow: .github/workflows/translation-stubs.yml
    • Runs in stub mode (GENERATE_STUBS=true), separate from issue tracking in translation-sync.yml
    • Triggered on push when src/content/reference/en/** changes
    • Supports manual workflow_dispatch with full_scan and languages inputs

Local testing

  • npm run test:stubs — dry-run stub generation into stub-preview/ (no PRs, no src/content/ changes)

Code organization
After discussing with Divyansh, I refactored the translation tracker from a single large index.js into focused modules:

| File | Re-purposed to |
| --- | --- |
| index.js | Entry point and orchestration |
| constants.js | Supported languages, content types, stub frontmatter keys |
| utils.js | Path helpers, frontmatter parsing, file scanning |
| github-tracker.js | GitHub API client (issues, diffs, stub PR creation) |
| workflows.js | Translation status checks and stub generation logic |
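
The stub PR creation handled by github-tracker.js follows the blob → tree → commit → branch → pull request sequence listed under Features. The sketch below shows one way that sequence could look, assuming an authenticated Octokit-style client (`octokit.rest.git.*` and `octokit.rest.pulls.create`). The function name, option names, branch naming, and commit and PR text are hypothetical, and the PR's actual client code may be organized differently.

```javascript
// Sketch only: commits stub files to a new branch and opens one PR for a language.
// `octokit` is assumed to be an authenticated @octokit/rest-style client.
async function openStubPullRequest(octokit, { owner, repo, base, branch, language, files }) {
  // Resolve the tip of the base branch and the tree it points to.
  const { data: baseRef } = await octokit.rest.git.getRef({ owner, repo, ref: `heads/${base}` });
  const baseSha = baseRef.object.sha;
  const { data: baseCommit } = await octokit.rest.git.getCommit({ owner, repo, commit_sha: baseSha });

  // 1. One blob per stub file.
  const tree = [];
  for (const file of files) {
    const { data: blob } = await octokit.rest.git.createBlob({
      owner, repo, content: file.content, encoding: 'utf-8',
    });
    tree.push({ path: file.path, mode: '100644', type: 'blob', sha: blob.sha });
  }

  // 2. A tree layered on top of the base tree.
  const { data: newTree } = await octokit.rest.git.createTree({
    owner, repo, base_tree: baseCommit.tree.sha, tree,
  });

  // 3. A commit pointing at the new tree.
  const message = `Add ${language} translation stubs`; // hypothetical wording
  const { data: commit } = await octokit.rest.git.createCommit({
    owner, repo, message, tree: newTree.sha, parents: [baseSha],
  });

  // 4. A branch pointing at the commit.
  await octokit.rest.git.createRef({ owner, repo, ref: `refs/heads/${branch}`, sha: commit.sha });

  // 5. The pull request, left for maintainers to review.
  const { data: pr } = await octokit.rest.pulls.create({
    owner, repo, title: message, head: branch, base,
    body: `Placeholder stubs for ${files.length} untranslated page(s).`,
  });
  return pr.number;
}
```

Going through the Git data API keeps every stub for a language in a single commit, so no working-tree checkout is needed in the workflow runner.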

Design decisions

  • Stub mode and issue-tracking mode are mutually exclusive (GENERATE_STUBS=true returns before issue logic), so PR and issue workflows stay independent.
  • Stub PRs are never auto-merged; language stewards and maintainers review them.
  • Reference content only for now; examples, tutorials, etc. can be added later via STUB_CONTENT_TYPES.
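
The mutual exclusion between the two modes can be pictured as an early return in the entry point. This is a minimal sketch, assuming the handlers are passed in; `runTracker`, `runStubMode`, and `runIssueMode` are hypothetical names, not the PR's actual functions.

```javascript
// Hypothetical entry-point dispatch: exactly one mode runs per invocation.
function runTracker(env, { runStubMode, runIssueMode }) {
  if (env.GENERATE_STUBS === 'true') {
    // Stub mode returns here, so the issue-tracking logic below is never reached.
    return runStubMode();
  }
  return runIssueMode();
}
```

In the real action this would presumably be called with `process.env`, with each workflow file setting or omitting `GENERATE_STUBS`.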

Test plan

  • Verified on fork (aashishpanthi/p5.js-website): workflow creates stub PRs when English reference files are added/changed
  • Confirmed up to 4 PRs created (one per language: es, hi, ko, zh-Hans)
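
To make the "one PR per language" behavior and the per-language STUB_MAX_FILES cap concrete, here is a small sketch of how the work could be planned. The function name and the shape of its inputs (a list of English file paths and a map of language to a Set of already-translated paths) are assumptions for illustration.

```javascript
// Hypothetical planner: for each language, list English files with no translation,
// capped at maxFiles per language (not across all languages combined).
function planStubsPerLanguage(englishFiles, existingByLanguage, languages, maxFiles = 50) {
  const plan = {};
  for (const lang of languages) {
    const existing = existingByLanguage[lang] ?? new Set();
    plan[lang] = englishFiles
      .filter((file) => !existing.has(file)) // missing in this language
      .slice(0, maxFiles);                   // cap applies to this language only
  }
  return plan;
}
```

Each non-empty entry in the returned plan would then correspond to one stub PR.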

PR Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works

To-do

  • Change the branch name in the workflow file to main. I added the feature branch for testing.

@Divyansh013 Divyansh013 left a comment

Member

Hi @aashishpanthi, I have added some comments.

├── test-local.js # Local testing
├── test-local.js # Local issue-tracking test
├── test-stubs.js # Local stub dry-run (writes to stub-preview/)
└── stub-preview/ # Dry-run output (gitignored)

stub-preview/ isn't in any .gitignore, so the recommended npm run test:stubs leaves untracked files behind (I hit this). Either add .github/actions/translation-tracker/stub-preview/ to .gitignore, or write the dry-run output under a directory that is already ignored.

*/
function generateStubFromEnglish(englishPath, language, contentType = 'reference') {
  const raw = fs.readFileSync(englishPath, 'utf8');
  const frontmatter = parseFrontmatter(raw, englishPath);

generateStubFromEnglish assumes MDX frontmatter, but the scanners also pick up .yaml files. When STUB_CONTENT_TYPES grows beyond reference, this will throw and abort the whole batch, because there is no per-file try/catch in the items.map. Either skip .yaml sources for stubbing or handle them explicitly, and wrap the per-file generation so that one failure doesn't kill the run.
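
One possible shape for that suggestion, as a sketch rather than the required fix: skip YAML sources up front and isolate each file in its own try/catch. The wrapper name, the `items` shape (`englishPath`, `language`), and the `skipped` report are assumptions; `generateStub` stands in for `generateStubFromEnglish` and is passed in so the sketch stays self-contained.

```javascript
// Hypothetical wrapper: one bad file is recorded and skipped instead of aborting the batch.
function generateStubsSafely(items, generateStub) {
  const stubs = [];
  const skipped = [];
  for (const item of items) {
    // YAML sources have no MDX frontmatter to copy, so they are not stubbed here.
    if (/\.ya?ml$/i.test(item.englishPath)) {
      skipped.push({ path: item.englishPath, reason: 'yaml source' });
      continue;
    }
    try {
      stubs.push(generateStub(item.englishPath, item.language));
    } catch (err) {
      // Keep going; report the failure at the end of the run.
      skipped.push({ path: item.englishPath, reason: err.message });
    }
  }
  return { stubs, skipped };
}
```

Returning the skipped list lets the caller log which files were left out without failing the workflow.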

const contentTypes = parseEnvList(process.env.STUB_CONTENT_TYPES, ['reference']);
const fullScan = options.fullScan ?? process.env.STUB_FULL_SCAN === 'true';
const dryRun = process.env.STUB_DRY_RUN === 'true';
const maxFiles = parseInt(process.env.STUB_MAX_FILES || '50', 10);

Guard against NaN here, e.g. fall back to 50 if Number.isNaN(maxFiles), so a bad env value doesn't silently generate nothing.
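
A minimal sketch of such a guard follows. The helper name `parseMaxFiles` is hypothetical, and treating zero or negative values as invalid is an extra assumption beyond the NaN check requested here.

```javascript
// Hypothetical helper: parse STUB_MAX_FILES, falling back to a default on bad input.
function parseMaxFiles(raw, fallback = 50) {
  const parsed = Number.parseInt(raw ?? '', 10);
  // NaN covers unset, empty, and non-numeric values; <= 0 is an added assumption.
  return Number.isNaN(parsed) || parsed <= 0 ? fallback : parsed;
}
```

It could then replace the inline expression, for example `const maxFiles = parseMaxFiles(process.env.STUB_MAX_FILES);`. Note that `parseInt` still accepts a leading number in values such as `10abc`.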

});

if (dryRun || !githubTracker) {
  const previewRoot =

This inline previewRoot duplicates the logic in getStubWritePath. Consider deriving the log path from the helper (or path.dirname of the first write) to keep one source of truth.
