fix(diagnostics): classify HTTP timeouts and string-wrapped 5xx as known codes#469
Open
Dumbris wants to merge 1 commit into
Open
fix(diagnostics): classify HTTP timeouts and string-wrapped 5xx as known codes#469Dumbris wants to merge 1 commit into
Dumbris wants to merge 1 commit into
Conversation
…own codes
The HTTP transport adapter wraps context.DeadlineExceeded and non-2xx
responses as plain "transport error: ..." strings before bubbling them up.
The typed errors.Is / statusError paths the classifier relied on never
fire for those, so every hf.co timeout or 504 surfaced to the UI as
MCPX_UNKNOWN_UNCLASSIFIED ("Server Error") even though the cause was a
well-known transient upstream condition.
Add two recovery rules in classifyHTTP:
- context.DeadlineExceeded on http transport -> new MCPX_HTTP_TIMEOUT
(severity: warn — usually transient, not a config bug)
- substring "context deadline exceeded" on http transport -> same
- "request failed with status NNN" / "notification failed with status NNN"
-> route through DiagnoseHTTPStatus so 401/403/404/5xx all get their
existing typed codes
Tests cover both the typed and stringified forms taken verbatim from
production server-hugginface.log.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Deploying mcpproxy-docs with
|
| Latest commit: |
fc13dfe
|
| Status: | ✅ Deploy successful! |
| Preview URL: | https://e416ae27.mcpproxy-docs.pages.dev |
| Branch Preview URL: | https://fix-classifier-http-timeout.mcpproxy-docs.pages.dev |
2 tasks
📦 Build ArtifactsWorkflow Run: View Run Available Artifacts
How to DownloadOption 1: GitHub Web UI (easiest)
Option 2: GitHub CLI gh run download 25906318932 --repo smart-mcp-proxy/mcpproxy-go
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Investigating why disabling tools in the huggingface server surfaced a scary "Server Error / MCPX_UNKNOWN_UNCLASSIFIED" alert, I found the toggle itself works (200 OK), but the upstream's HTTP transport occasionally times out against hf.co/mcp. When that happens during the post-toggle re-fetch, the error reached the user wrapped as plain text — the classifier's typed paths (
errors.Is,*statusError) never fired, so it fell through toUNCLASSIFIED.This PR adds two recovery rules in
classifyHTTP:context.DeadlineExceededon http transport → newMCPX_HTTP_TIMEOUT(severity: warn — transient by nature)"context deadline exceeded"on http transport → same code (catches the string-wrapped form the upstream layer emits)"request failed with status NNN"/"notification failed with status NNN"→ routed through the existingDiagnoseHTTPStatus(int)so 401/403/404/5xx all get their typed codes (no more "UNCLASSIFIED" for plain-string 504s)Tests are taken verbatim from
~/Library/Logs/mcpproxy/server-hugginface.logso the rules cover the exact wire shape the field produces.Why this matters
MCPX_UNKNOWN_UNCLASSIFIEDrenders as red "Server Error" with no remediation.MCPX_HTTP_TIMEOUTandMCPX_HTTP_5XXalready have catalog entries with calmer messaging ("usually transient", "upstream status page link"), andHTTPTimeoutisSeverityWarnso the UI can choose a yellow/degraded badge instead of a red error.This is half of the user-visible fix discussed in the investigation thread. The other half — don't flap the server to
Errorstate on a single 5-second health-check miss — is being addressed in a separate PR.Test plan
go test ./internal/diagnostics/ -race— all green, including the three new cases (TestClassify_HTTP_Timeout,TestClassify_HTTP_TimeoutStringWrapped,TestClassify_HTTP_StatusFromText)go build ./...cleango test ./internal/runtime/supervisor/ -race— green (no churn on attach path)🤖 Generated with Claude Code