fix(chromium): preserve binary response body with text content-type #40795
SebTardif wants to merge 1 commit
When a server responds with Content-Type: text/plain;charset=UTF-8 but the body contains binary data, CDP's Network.getResponseBody decodes it as UTF-8 (base64Encoded: false), replacing invalid byte sequences with U+FFFD. Detect this corruption by comparing Content-Length with the decoded buffer's byte length, and re-fetch via loadNetworkResource which preserves binary fidelity through IO.read streams. Fixes: microsoft#40510
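The detection step described above can be sketched roughly as follows. This is an illustration only, not the code from this PR: `looksLossilyDecoded` is a hypothetical helper name, and the sketch assumes `Content-Length` is present and describes the uncompressed body. It simulates the lossy decode with `TextDecoder`, which replaces each invalid byte with U+FFFD just as the description says CDP does.

```typescript
// Hypothetical helper (not the PR's actual code): flags a body that was
// returned as text when its UTF-8 byte length disagrees with Content-Length.
function looksLossilyDecoded(
  body: string,
  base64Encoded: boolean,
  contentLength: string | undefined,
): boolean {
  // Base64 bodies are byte-exact; without Content-Length there is nothing to compare.
  if (base64Encoded || contentLength === undefined) return false;
  const expected = Number.parseInt(contentLength, 10);
  if (!Number.isFinite(expected)) return false;
  return Buffer.byteLength(body, 'utf8') !== expected;
}

// Simulate the lossy decode on eight bytes, five of which are invalid UTF-8.
const original = Uint8Array.from([0x80, 0x81, 0x82, 0xff, 0xfe, 0x00, 0x01, 0x02]);
const decoded = new TextDecoder('utf-8').decode(original); // invalid bytes become U+FFFD
const decodedByteLength = Buffer.byteLength(decoded, 'utf8'); // 5 * 3 + 3 = 18
const corrupted = looksLossilyDecoded(decoded, false, String(original.length));
```

Here `decodedByteLength` is 18 while `Content-Length` would be 8, so `corrupted` is `true`. A body that is valid UTF-8 round-trips to the same byte length and is left alone.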
Test results for "MCP": 10 failed, 1 flaky, 7059 passed, 1068 skipped.
Test results for "tests 1": 11 failed, 3 flaky, 41753 passed, 850 skipped.
Summary
Re-fetches corrupted bodies through the loadNetworkResource + IO.read stream path, which preserves binary fidelity.

When a server responds with Content-Type: text/plain;charset=UTF-8 but the body is binary, CDP's Network.getResponseBody decodes it as UTF-8 (base64Encoded: false), replacing invalid byte sequences with U+FFFD. For example, the 8 bytes [0x80, 0x81, 0x82, 0xFF, 0xFE, 0x00, 0x01, 0x02] become 18 bytes: each of the 5 invalid bytes is replaced with a 3-byte replacement character (15 bytes), and the 3 valid bytes are kept as-is.

The workaround detects this by comparing Content-Length with the decoded buffer's byte length. On mismatch, it re-fetches from cache via Network.loadNetworkResource, which returns raw bytes through IO.read streams without UTF-8 interpretation.

Chromium-only fix. WebKit has the same CDP-level behavior but no loadNetworkResource equivalent. Firefox is unaffected (always returns base64).

Prior art
The loadNetworkResource fallback path was originally added for prefetch scripts that return empty bodies. This PR extends the same mechanism to handle binary body corruption.

Fixes #40510
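For readers unfamiliar with the stream-based path, a minimal sketch of draining a `Network.loadNetworkResource` result through `IO.read` might look like the following. It is not the PR's implementation: `CdpLike` and `fetchRawBody` are hypothetical names, the `options` values are assumptions, and Playwright's real session type differs. The CDP method names and the `stream`, `data`, `base64Encoded`, and `eof` fields follow the protocol's published shape.

```typescript
// Minimal stand-in for a CDP session (hypothetical; for illustration only).
interface CdpLike {
  send(method: string, params?: Record<string, unknown>): Promise<any>;
}

// Hypothetical helper: re-fetch `url` and return its raw bytes.
async function fetchRawBody(session: CdpLike, frameId: string, url: string): Promise<Buffer> {
  const { resource } = await session.send('Network.loadNetworkResource', {
    frameId,
    url,
    // Assumed options: allow a cache hit and send credentials.
    options: { disableCache: false, includeCredentials: true },
  });
  if (!resource.success || !resource.stream)
    throw new Error(`loadNetworkResource failed: ${resource.netErrorName ?? 'unknown error'}`);

  const chunks: Buffer[] = [];
  try {
    for (;;) {
      const chunk = await session.send('IO.read', { handle: resource.stream });
      // Each chunk says whether its data is base64; decode accordingly.
      chunks.push(Buffer.from(chunk.data, chunk.base64Encoded ? 'base64' : 'utf8'));
      if (chunk.eof) break;
    }
  } finally {
    // Release the stream handle even if a read fails.
    await session.send('IO.close', { handle: resource.stream });
  }
  return Buffer.concat(chunks);
}
```

Collecting chunks and concatenating once at the end avoids repeated buffer copies, and closing the handle in `finally` keeps the stream from leaking when a read throws.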