Docs: Fix duplicate search results by moving versioned docs out of docs_dir #16371

Merged
kevinjqliu merged 9 commits into apache:main from kevinjqliu:fix-site-search-duplicates
May 19, 2026
@kevinjqliu
Contributor

@kevinjqliu kevinjqliu commented May 17, 2026

Problem

Every versioned documentation page appears twice in the site search index.
The root cause is that versioned doc sources live inside the parent MkDocs
project's docs_dir (site/docs/docs/<ver>/docs/). MkDocs therefore scans them
directly, while the mkdocs-monorepo-plugin also injects them at their
canonical URLs, producing two rendered pages and two search entries per page.

See #16368 (review)

Solution

Move the versioned docs worktree from site/docs/docs/ (inside docs_dir) to
site/versioned-docs/ (outside docs_dir). Since the worktree is generated at
build time by dev/common.sh, this is purely a path change in build scripts and
nav configs — no content changes.

With versioned sources outside docs_dir, MkDocs' own file scanner never sees
them. Only the monorepo plugin renders them at their canonical URLs
(docs/<ver>/<page>/), eliminating the duplicate entries entirely.
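
The effect of the move can be illustrated with a small, simplified model. This is only a sketch: `scanned_pages` is a hypothetical stand-in for MkDocs' file scan (a recursive walk of docs_dir), docs_dir is assumed to be site/docs, and the page name `configuration.md` is made up for the example.

```python
import tempfile
from pathlib import Path


def scanned_pages(docs_dir: Path) -> list[str]:
    """Return Markdown files under docs_dir, relative to it.

    Hypothetical stand-in for MkDocs' own scan, not its real implementation.
    """
    return sorted(p.relative_to(docs_dir).as_posix() for p in docs_dir.rglob("*.md"))


def build_layout(root: Path, versioned_root: str) -> Path:
    """Create a toy site tree under root and return its docs_dir (assumed site/docs)."""
    docs_dir = root / "site" / "docs"
    docs_dir.mkdir(parents=True)
    (docs_dir / "index.md").write_text("# Home\n")
    # Versioned sources, as the build-time worktree would place them.
    version_docs = root / versioned_root / "1.10.1" / "docs"
    version_docs.mkdir(parents=True)
    (version_docs / "configuration.md").write_text("# Configuration\n")
    return docs_dir


with tempfile.TemporaryDirectory() as tmp:
    # Old layout: versioned docs live inside docs_dir, so the scan picks them up.
    old = scanned_pages(build_layout(Path(tmp) / "old", "site/docs/docs"))
    # New layout: versioned docs live outside docs_dir, so the scan never sees them.
    new = scanned_pages(build_layout(Path(tmp) / "new", "site/versioned-docs"))

print(old)
print(new)
```

In the old layout the scan returns the versioned page in addition to `index.md`; in the new layout it returns `index.md` only, leaving the monorepo plugin as the single source of the versioned pages.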

Verified locally: search shows only 1 result.
[Screenshot 2026-05-16 at 7 46 09 PM]
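
A check like the one above could also be scripted. The sketch below assumes the built search index has the shape written by MkDocs' built-in search plugin (a top-level `docs` list whose entries carry `location`, `title`, and `text`); the helper name `duplicate_entries` and the sample locations are hypothetical.

```python
from collections import defaultdict


def duplicate_entries(index: dict) -> dict[tuple[str, str], list[str]]:
    """Group search entries by (title, text) and keep groups found at more than one location."""
    groups: dict[tuple[str, str], list[str]] = defaultdict(list)
    for doc in index.get("docs", []):
        key = (doc.get("title", ""), doc.get("text", ""))
        groups[key].append(doc.get("location", ""))
    return {key: locs for key, locs in groups.items() if len(locs) > 1}


# Hypothetical sample: the same page indexed once at the scanned path
# and once at its canonical URL, plus one page that appears only once.
sample = {
    "docs": [
        {"location": "docs/docs/1.10.1/docs/configuration/",
         "title": "Configuration", "text": "Catalog properties"},
        {"location": "docs/1.10.1/configuration/",
         "title": "Configuration", "text": "Catalog properties"},
        {"location": "docs/1.10.1/api/", "title": "API", "text": "Java API"},
    ]
}

dupes = duplicate_entries(sample)
for (title, _), locations in dupes.items():
    print(title, locations)
```

For a real build, the dictionary would come from `json.load` on the generated index file; an empty result would suggest that each page is indexed once.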

Changes

  • site/dev/common.sh — worktree created at versioned-docs/ instead of docs/docs/
  • site/nav.yml, site/mkdocs-dev.yml — !include paths updated
  • site/README.md — documentation updated to reflect new layout
  • .gitignore — ignore rule updated (site/docs/docs/ → site/versioned-docs/)

@kevinjqliu
Contributor Author

@MaxNevermind do you mind verifying this?

@MaxNevermind
Contributor

> @MaxNevermind do you mind verifying this?

I missed this question from yesterday and worked on my own fix for it today 😆 - https://github.com/apache/iceberg/compare/main...MaxNevermind:iceberg:fix-change-default-docs-version?expand=1

Anyway, my solution is shorter, but it is more of a workaround implemented through exclusions, whereas your solution actually addresses the root cause. I checked out your branch and it seems to work alright.

@MaxNevermind
Contributor

btw
From the branch comparison above you can see my approach to the issue with the default landing docs version. I just swapped nightly and latest; since latest becomes the first expandable container at the top, it becomes the default landing version. Not sure if that is an acceptable solution, as it looks a bit off because the order is broken: Latest, Nightly, 1.10.1, 1.9.2 ...

@kevinjqliu
Contributor Author

Thanks for taking a look @MaxNevermind
I went down the exclusion rabbit hole too. I think the best practice for using the monorepo plugin is to have the versioned docs outside of the docs_dir.

We still need to fix the latest/nightly default issue. Could you repurpose your branch to just include the ordering swap and index fix?

Contributor

@huaxingao huaxingao left a comment

LGTM

@kevinjqliu kevinjqliu merged commit 4bece41 into apache:main May 19, 2026
7 checks passed
@kevinjqliu kevinjqliu deleted the fix-site-search-duplicates branch May 19, 2026 02:27
@kevinjqliu
Contributor Author

It's live and the duplicate search index entries are fixed!
[Screenshot 2026-05-18 at 7 35 58 PM]
