Skip to content

fix(diagnostics): classify HTTP timeouts and string-wrapped 5xx as known codes#469

Open
Dumbris wants to merge 1 commit into
mainfrom
fix/classifier-http-timeout
Open

fix(diagnostics): classify HTTP timeouts and string-wrapped 5xx as known codes#469
Dumbris wants to merge 1 commit into
mainfrom
fix/classifier-http-timeout

Conversation

@Dumbris
Copy link
Copy Markdown
Member

@Dumbris Dumbris commented May 15, 2026

Summary

Investigating why disabling tools in the huggingface server surfaced a scary "Server Error / MCPX_UNKNOWN_UNCLASSIFIED" alert, I found the toggle itself works (200 OK), but the upstream's HTTP transport occasionally times out against hf.co/mcp. When that happens during the post-toggle re-fetch, the error reached the user wrapped as plain text — the classifier's typed paths (errors.Is, *statusError) never fired, so it fell through to UNCLASSIFIED.

This PR adds two recovery rules in classifyHTTP:

  • context.DeadlineExceeded on http transport → new MCPX_HTTP_TIMEOUT (severity: warn — transient by nature)
  • substring "context deadline exceeded" on http transport → same code (catches the string-wrapped form the upstream layer emits)
  • "request failed with status NNN" / "notification failed with status NNN" → routed through the existing DiagnoseHTTPStatus(int) so 401/403/404/5xx all get their typed codes (no more "UNCLASSIFIED" for plain-string 504s)

Tests are taken verbatim from ~/Library/Logs/mcpproxy/server-hugginface.log so the rules cover the exact wire shape the field produces.

Why this matters

MCPX_UNKNOWN_UNCLASSIFIED renders as red "Server Error" with no remediation. MCPX_HTTP_TIMEOUT and MCPX_HTTP_5XX already have catalog entries with calmer messaging ("usually transient", "upstream status page link"), and HTTPTimeout is SeverityWarn so the UI can choose a yellow/degraded badge instead of a red error.

This is half of the user-visible fix discussed in the investigation thread. The other half — don't flap the server to Error state on a single 5-second health-check miss — is being addressed in a separate PR.

Test plan

  • go test ./internal/diagnostics/ -race — all green, including the three new cases (TestClassify_HTTP_Timeout, TestClassify_HTTP_TimeoutStringWrapped, TestClassify_HTTP_StatusFromText)
  • go build ./... clean
  • go test ./internal/runtime/supervisor/ -race — green (no churn on attach path)

🤖 Generated with Claude Code

…own codes

The HTTP transport adapter wraps context.DeadlineExceeded and non-2xx
responses as plain "transport error: ..." strings before bubbling them up.
The typed errors.Is / statusError paths the classifier relied on never
fire for those, so every hf.co timeout or 504 surfaced to the UI as
MCPX_UNKNOWN_UNCLASSIFIED ("Server Error") even though the cause was a
well-known transient upstream condition.

Add two recovery rules in classifyHTTP:
- context.DeadlineExceeded on http transport -> new MCPX_HTTP_TIMEOUT
  (severity: warn — usually transient, not a config bug)
- substring "context deadline exceeded" on http transport -> same
- "request failed with status NNN" / "notification failed with status NNN"
  -> route through DiagnoseHTTPStatus so 401/403/404/5xx all get their
  existing typed codes

Tests cover both the typed and stringified forms taken verbatim from
production server-hugginface.log.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@cloudflare-workers-and-pages
Copy link
Copy Markdown

Deploying mcpproxy-docs with  Cloudflare Pages  Cloudflare Pages

Latest commit: fc13dfe
Status: ✅  Deploy successful!
Preview URL: https://e416ae27.mcpproxy-docs.pages.dev
Branch Preview URL: https://fix-classifier-http-timeout.mcpproxy-docs.pages.dev

View logs

@github-actions
Copy link
Copy Markdown

📦 Build Artifacts

Workflow Run: View Run
Branch: fix/classifier-http-timeout

Available Artifacts

  • archive-darwin-amd64 (26 MB)
  • archive-darwin-arm64 (23 MB)
  • archive-linux-amd64 (15 MB)
  • archive-linux-arm64 (13 MB)
  • archive-windows-amd64 (26 MB)
  • archive-windows-arm64 (23 MB)
  • frontend-dist-pr (0 MB)
  • installer-dmg-darwin-amd64 (20 MB)
  • installer-dmg-darwin-arm64 (18 MB)

How to Download

Option 1: GitHub Web UI (easiest)

  1. Go to the workflow run page linked above
  2. Scroll to the bottom "Artifacts" section
  3. Click on the artifact you want to download

Option 2: GitHub CLI

gh run download 25906318932 --repo smart-mcp-proxy/mcpproxy-go

Note: Artifacts expire in 14 days.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants