feat: use sublevel llms.txt per product for MCP resources #27

Open
mattpodwysocki wants to merge 8 commits into main from feat/sublevel-llms-txt-resources

@mattpodwysocki mattpodwysocki commented Apr 14, 2026

Summary

docs.mapbox.com restructured its llms.txt. Every product now has its own llms.txt (page index) and llms-full.txt (full page content) at its own URL. The root docs.mapbox.com/llms.txt is now a pure link index pointing to these sublevel files — it no longer contains any documentation content itself.

The previous resources all fetched the root llms.txt and filtered its sections by category keyword. With the new structure, those filtered sections contain only link-lists (links to other llms.txt files), not actual docs. The resources were effectively returning empty or useless content.
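To make the old behavior concrete, here is a minimal sketch of keyword-based section filtering. It assumes sections in llms.txt are delimited by level-2 markdown headings (`## `); the function name `filterSections` is hypothetical and the real implementation may split the file differently.

```typescript
// Sketch only: assumes llms.txt sections start with "## " headings.
function filterSections(markdown: string, keywords: string[]): string {
  // Split before every level-2 heading; the first chunk is any preamble.
  const sections = markdown.split(/^(?=## )/m);
  const wanted = sections.filter((section) => {
    const heading = (section.split("\n")[0] ?? "").toLowerCase();
    if (heading.indexOf("## ") !== 0) return false; // skip the preamble
    // Keep the section if its heading mentions any keyword.
    return keywords.some((k) => heading.indexOf(k.toLowerCase()) !== -1);
  });
  return wanted.join("").trim();
}
```

When the matched sections hold only links to other llms.txt files, the output of a filter like this is a link list rather than documentation, which is the problem described above.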

What changed

| Resource | Before | After |
|---|---|---|
| resource://mapbox-api-reference | Root llms.txt, filtered for "api" sections | docs.mapbox.com/api/llms.txt (structured index of all REST APIs by service) |
| resource://mapbox-guides | Root llms.txt, filtered for "guide/studio/design" sections | docs.mapbox.com/help/llms.txt (39KB Help Center with actual guide content) |
| resource://mapbox-sdk-docs | Root llms.txt, filtered for "sdk/library" sections | docs.mapbox.com/mapbox-gl-js/llms.txt (34KB GL JS index: guides, API ref, examples) |
| resource://mapbox-reference | Root llms.txt, filtered for "reference" sections | Root llms.txt, unfiltered (complete product catalog for discovery) |
| resource://mapbox-examples | Root llms.txt, filtered for playground/demo sections | Unchanged (root filtered for examples/playground/demo, which are still valid link lists) |
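The mapping above can be summarized as a lookup table. This is an illustrative sketch, not the PR's code: `RESOURCE_SOURCES`, `ResourceSource`, and `sourceFor` are hypothetical names, and the `https://` scheme and filter keywords are assumed from the text.

```typescript
// Hypothetical lookup table mirroring the "After" column of the table above.
const DOCS_ROOT = "https://docs.mapbox.com";

type ResourceSource = {
  url: string;
  // Keywords used to filter root sections; undefined means "return the file unfiltered".
  filterKeywords?: string[];
};

const RESOURCE_SOURCES: Record<string, ResourceSource> = {
  "resource://mapbox-api-reference": { url: `${DOCS_ROOT}/api/llms.txt` },
  "resource://mapbox-guides": { url: `${DOCS_ROOT}/help/llms.txt` },
  "resource://mapbox-sdk-docs": { url: `${DOCS_ROOT}/mapbox-gl-js/llms.txt` },
  "resource://mapbox-reference": { url: `${DOCS_ROOT}/llms.txt` },
  "resource://mapbox-examples": {
    url: `${DOCS_ROOT}/llms.txt`,
    filterKeywords: ["example", "playground", "demo"],
  },
};

// Resolve a resource URI to its source, failing loudly on unknown URIs.
function sourceFor(uri: string): ResourceSource {
  const source = RESOURCE_SOURCES[uri];
  if (!source) throw new Error(`Unknown resource URI: ${uri}`);
  return source;
}
```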

Additional improvements

  • fetchCachedText(url, httpRequest) — new shared helper in docFetcher.ts that fetches and caches a URL. All five resource classes now use this instead of duplicating the fetch+cache pattern (~30 lines each).

  • toMarkdownUrl fix — previously returned a .txt.md URL for llms.txt URLs, causing get_document_tool to waste a request on a 404 before falling back. Now returns null for URLs already ending in .txt, .md, or .json, so they're fetched directly.
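A rough sketch of the toMarkdownUrl behavior described above. It assumes the markdown variant of a page lives at the same path with `.md` appended; the actual rewrite rule in docFetcher.ts may differ, and this version drops any query string or fragment for simplicity.

```typescript
// Sketch only: the real toMarkdownUrl in docFetcher.ts may differ.
function toMarkdownUrl(url: string): string | null {
  // Ignore any query string or fragment when inspecting the extension.
  const path = url.split(/[?#]/)[0] ?? url;
  // Files that are already text, markdown, or JSON have no .md variant to try.
  if (/\.(txt|md|json)$/i.test(path)) return null;
  // Assumption: the markdown variant is the same path plus ".md".
  return path.replace(/\/$/, "") + ".md";
}
```

A caller can treat `null` as "fetch the original URL directly", which avoids the extra request for a `llms.txt.md` file that does not exist.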

How to use llms-full.txt

Full page content for any product is available via get_document_tool:

  • https://docs.mapbox.com/style-spec/llms-full.txt — 466KB, full Style Spec
  • https://docs.mapbox.com/ios/navigation/llms-full.txt — 696KB, full Nav SDK iOS
  • https://docs.mapbox.com/mapbox-gl-js/llms-full.txt — 1.6MB, full GL JS docs

The root resource://mapbox-reference now surfaces the complete product catalog including all llms.txt URLs, so models can discover and request any product's full docs.
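As an illustration of that discovery step, the sketch below pulls sublevel llms.txt links out of the root index and derives the matching llms-full.txt URLs. It assumes the root index contains absolute `https://docs.mapbox.com/.../llms.txt` links and that each product's llms-full.txt sits next to its llms.txt; `fullDocsUrls` is a hypothetical helper, not part of this PR.

```typescript
// Sketch only: assumes absolute sublevel links appear verbatim in the root index.
function fullDocsUrls(rootIndex: string): string[] {
  // Require at least one path segment so the root llms.txt itself is skipped.
  const pattern = /https:\/\/docs\.mapbox\.com\/[^\s)]+\/llms\.txt/g;
  const urls: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(rootIndex)) !== null) {
    const full = match[0].replace(/llms\.txt$/, "llms-full.txt");
    // De-duplicate while preserving first-seen order.
    if (urls.indexOf(full) === -1) urls.push(full);
  }
  return urls;
}
```

Given how large these files are (up to 1.6MB for GL JS), a caller would probably want to request only the one product it needs.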

Test plan

  • npm test — 62 tests pass
  • CI green
  • Verify resource://mapbox-api-reference returns actual API endpoint list (not a link-index)
  • Verify resource://mapbox-guides returns Help Center content (not empty)

"Show me the Mapbox API Reference"
[Screenshot 2026-04-14 at 14 21 03]

"What Mapbox APIs are available for navigation?"
[Screenshot 2026-04-14 at 14 23 18]

🤖 Generated with Claude Code

mattpodwysocki and others added 3 commits April 1, 2026 14:16
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
docs.mapbox.com restructured so that llms.txt now exists at every product
level (e.g. /api/llms.txt, /help/llms.txt, /mapbox-gl-js/llms.txt) and
llms-full.txt contains full page content. The root llms.txt is now a pure
link index, so resources that filtered it by category keyword were returning
empty content.

- resource://mapbox-api-reference → docs.mapbox.com/api/llms.txt
  (structured index of all REST APIs by service category)
- resource://mapbox-guides → docs.mapbox.com/help/llms.txt
  (39KB Help Center index with actual guide content)
- resource://mapbox-sdk-docs → docs.mapbox.com/mapbox-gl-js/llms.txt
  (34KB GL JS documentation index with guides, API ref, examples)
- resource://mapbox-reference → root llms.txt without filtering
  (complete product catalog for discovering available docs)
- resource://mapbox-examples → continues filtering root for
  playground/demo sections

Add fetchCachedText() shared helper to docFetcher to consolidate the
repeated fetch+cache pattern across all resource implementations.
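A minimal sketch of what a shared fetch+cache helper with this signature could look like. The `HttpRequest` type and the in-memory `cache` map are assumptions for illustration; the project's docCache and httpRequest types are not shown in this PR description and likely differ (for example, by adding expiry).

```typescript
// Minimal fetch-like signature so the sketch needs no DOM or Node typings.
type HttpRequest = (
  url: string
) => Promise<{ ok: boolean; status: number; text(): Promise<string> }>;

// Stand-in for the project's docCache; assumed here to be a plain in-memory map.
const cache = new Map<string, string>();

async function fetchCachedText(url: string, httpRequest: HttpRequest): Promise<string> {
  // Serve from cache when the URL has already been fetched.
  const cached = cache.get(url);
  if (cached !== undefined) return cached;

  const response = await httpRequest(url);
  if (!response.ok) {
    throw new Error(`Failed to fetch ${url}: HTTP ${response.status}`);
  }
  // Only successful responses are cached, so failures are retried next time.
  const text = await response.text();
  cache.set(url, text);
  return text;
}
```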

Fix toMarkdownUrl() to return null for URLs already ending in .txt/.md/.json
so get_document_tool fetches llms.txt files directly without a wasted
.md rewrite attempt.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
mattpodwysocki and others added 3 commits April 14, 2026 13:52
cspell flagged capitalized 'Isochrone' as an unknown word in two
resource description strings. Lowercased to fix CI.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
cspell doesn't know 'isochrone' by default. Adding both cases to the
project word list so the CI spellcheck passes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
New tool that fetches the llms.txt documentation index for any Mapbox
product — model can autonomously pull the right index without the user
manually attaching a resource. Supports 13 product keys, results cached
via docCache.
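The core of such a tool is a mapping from product key to index URL. The sketch below uses a hypothetical subset of keys built from paths mentioned in this PR; the 13 real keys are not listed here, and `PRODUCT_PATHS` and `docsIndexUrl` are illustrative names.

```typescript
// Hypothetical subset of product keys; the actual tool supports 13.
const PRODUCT_PATHS = {
  api: "api",
  help: "help",
  "mapbox-gl-js": "mapbox-gl-js",
  "style-spec": "style-spec",
  "ios-navigation": "ios/navigation",
} as const;

type ProductKey = keyof typeof PRODUCT_PATHS;

// Type guard that rejects inherited properties such as "toString".
function isProductKey(value: string): value is ProductKey {
  return Object.prototype.hasOwnProperty.call(PRODUCT_PATHS, value);
}

// Build the llms.txt index URL for a product, rejecting unknown keys.
function docsIndexUrl(product: string): string {
  if (!isProductKey(product)) {
    throw new Error(`Unknown product: ${product}`);
  }
  return `https://docs.mapbox.com/${PRODUCT_PATHS[product]}/llms.txt`;
}
```

Validating the key against a fixed set keeps the tool from fetching arbitrary URLs on the model's behalf.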

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
mattpodwysocki and others added 2 commits April 14, 2026 14:17
The custom 'Accept: text/markdown, text/plain;q=0.9, */*;q=0.8' header
was triggering a 403 Forbidden from the docs.mapbox.com CDN. Removing it
(and the equivalent in fetchDocContent's fallback path) fixes the issue —
the CDN serves the file correctly without an explicit Accept header.
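For illustration, one way to guarantee that no Accept header reaches the CDN is to strip it from whatever headers a caller supplies. `withoutAcceptHeader` is a hypothetical helper; the actual change simply removed the header from the request code.

```typescript
// Sketch only: returns a copy of the headers with any Accept header removed.
function withoutAcceptHeader(headers: Record<string, string>): Record<string, string> {
  const result: Record<string, string> = {};
  for (const name of Object.keys(headers)) {
    const value = headers[name];
    // HTTP header names are case-insensitive, so compare in lowercase.
    if (value !== undefined && name.toLowerCase() !== "accept") {
      result[name] = value;
    }
  }
  return result;
}
```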

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
feat: add get_mapbox_docs_index_tool

@zmofei zmofei left a comment


LGTM — clean adaptation to the upstream llms.txt restructure.

Two minor observations (neither blocking):

  1. fetchDocContent Accept header removal (src/utils/docFetcher.ts:106) — the Accept: text/markdown header was also removed from the general get_document_tool fallback path, not just the new fetchCachedText. Low risk since the .md variant is tried first, but it's a slightly broader change than the PR scope suggests.

  2. toMarkdownUrl fix is a nice catch — preventing the llms.txt.md 404 → fallback double-fetch is a real performance win for all .txt/.md/.json URLs going through get_document_tool.

New GetDocsIndexTool is well-designed with proper annotations and typed product enum. Good test coverage across happy path, caching, errors, and invalid input.
