Description
When the Slack handler fetches a file URL from files.slack.com, it does not validate the response content-type or magic bytes before forwarding to the model API. We observed an image/png-labeled payload sent to Anthropic whose decoded bytes were Slack workspace login HTML (~55KB), consistent with the bot token lacking the files:read OAuth scope so Slack served the login page in place of the file binary. openab base64-encoded those HTML bytes, labeled them with the Slack-reported MIME, and forwarded to Anthropic, which rejected with 400 invalid_request_error "Could not process image".
Two failure modes:
- Confusing UX: the user sees an Anthropic 400 that looks like an Anthropic-side image format problem, not a Slack auth/scope problem.
- Session poisoning: the bad payload is persisted into the claude-agent-acp session JSONL on PVC. Subsequent messages in the same Slack thread resume the session and replay the bad image block as part of conversation history, so Anthropic re-rejects on every turn until the JSONL is manually deleted.
Reproduction
1. Configure a Slack app for openab without files:read scope. Verify with:

       kubectl exec deployment/openab-claude -- sh -c \
       'curl -sS -D - -o /dev/null -X POST https://slack.com/api/auth.test \
       -H "Authorization: Bearer $SLACK_BOT_TOKEN" | grep -i "^x-oauth-scopes:"'

   The response header should not contain a comma-separated files:read token.
2. As a Slack user, upload any image to the bot in a thread.
3. Bot reply:

       :warning: Internal Error (code: -32603)
       Internal error: API Error: 400 {"type":"error","error":{"type":"invalid_request_error","message":"Could not process image"},"request_id":"req_011..."}

4. Add files:read scope to the Slack app, reinstall, rotate the token, redeploy, and verify the new scope is in x-oauth-scopes from step 1. Now upload another image to the same Slack thread: same 400.
5. Recover by deleting the session JSONL inside the pod:

       kubectl exec deployment/openab-claude -- bash -c \
       'grep -lE "Could not process image" /home/node/.claude/projects/-home-node/*.jsonl | xargs rm -v'
Expected Behavior
When openab fetches a file URL from Slack and the response is not a valid image, it should:
- Detect the failure synchronously, by inspecting the HTTP Content-Type header and/or the response body's magic bytes, before base64-encoding for the model API.
- Fail fast with a user-actionable error, e.g. "I couldn't access that image. Make sure the bot has the files:read OAuth scope." Don't forward unverified bytes to the model.
- Not persist the unverified payload to claude session history, so that recovering from misconfiguration is "fix the config" without an out-of-band PVC cleanup.
End-to-end success criterion: after the operator adds files:read and rotates the token, uploading an image in the same Slack thread succeeds on the next message, with no manual session deletion.
Actual Behavior
- openab fetches files.slack.com/... with the bot token. We did not capture the raw HTTP response in our logs; based on the bytes that ended up in the conversation block we infer Slack returned the workspace login HTML page in place of the file binary.
- openab does not visibly validate Content-Type or magic bytes. It reads the body, base64-encodes it, and labels it with the MIME from the Slack file metadata (image/png in our case).
- openab forwards to Anthropic; Anthropic decodes, sees magic 3c 21 44 4f 43 54 59 50 (<!DOCTYP) instead of PNG 89 50 4e 47 0d 0a 1a 0a, and returns 400.
- claude-agent-acp persists the user message (containing the bad image block) into ~/.claude/projects/-home-node/<session>.jsonl regardless of API success.
- Subsequent messages in the same Slack thread resume that session and resend the bad image block as part of history. Anthropic 400s again.
Evidence
A. The forwarded payload was HTML, not PNG
Decoded base64 of an image/png payload sent to Anthropic from a poisoned session in our cluster (Anthropic request_id req_011CarWAQQ11f3C1jzLbPkgP):
| field | value |
| --- | --- |
| claimed media_type | image/png |
| base64 length | 74,124 chars |
| decoded bytes | 55,592 |
| magic bytes (first 8) | 3c 21 44 4f 43 54 59 50 (<!DOCTYP) |
| last 120 bytes | ...slack-www-hhvm-main-iad-ya7nx4dnuum1/ 2026-05-08 21:55:25/...</body></html> |
PNG magic should be 89 50 4e 47 0d 0a 1a 0a. The bytes are HTML; the last-byte tail matches Slack's web-page footer. We did not capture the raw fetch response, so the "Slack returned HTML" link is inferred — sanitized fetch logs from anyone who can reproduce in their cluster would tighten this.
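The fields in the table above can be derived from any captured payload with a few lines of Python. This is a minimal sketch for reproducing the inspection, not code from openab; `describe_payload` and the stand-in `sample` are hypothetical names introduced here for illustration.

```python
import base64


def describe_payload(b64_data: str) -> dict:
    """Decode a base64 image payload and summarise what the bytes actually are."""
    raw = base64.b64decode(b64_data)
    return {
        "base64_length": len(b64_data),
        "decoded_bytes": len(raw),
        "magic_hex": raw[:8].hex(" "),  # first 8 bytes, space-separated hex
        "magic_ascii": raw[:8].decode("ascii", errors="replace"),
    }


# Stand-in payload (an assumption for illustration): an HTML document that
# would be labelled image/png, like the one described in the table.
sample = base64.b64encode(
    b"<!DOCTYPE html><html><body>login</body></html>"
).decode("ascii")
print(describe_payload(sample))
```

For the stand-in payload the reported magic is 3c 21 44 4f 43 54 59 50, the same prefix as in the table. The table's sizes are also self-consistent: 55,592 bytes encode to 4 * ceil(55592 / 3) = 74,124 base64 characters.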
B. The payload was replayed from session history on subsequent turns
Same session ID 774ec817-7ace-4e47-81a9-c875962ef720 (claude-agent-acp session JSONL on PVC):
- Line 6 (06:14 UTC, original incident): user message with an image block whose base64 decodes to the HTML above. Anthropic 400 (request_id req_011CarWAQQ11f3C1jzLbPkgP).
- Line 54 (~10 hours later, 08:04 UTC): user message in the same Slack thread containing only text ("你可看到上面的圖片了嗎?", i.e. "Can you see the image above now?") and sender_context, with no image block. Anthropic 400 again (request_id req_011Carea29tJDW...).
- Line 58 (08:06 UTC): same pattern, text-only user message ("can you read the image ?"). Anthropic 400 (request_id req_011CarefyzaStqh7dZPAd3j5).
- Line 60 immediately after: synthetic assistant message recording the 400 (isApiErrorMessage: true, apiErrorStatus: 400, model <synthetic>).
i.e. text-only user turns failed with the same Could not process image error, which is consistent with the bad image block at line 6 being replayed from JSONL on every turn. After we deleted that JSONL on the PVC, a fresh thread/session in the same channel was unblocked.
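To locate a poisoned line like line 6 without reading the JSONL by hand, a scan along the following lines may help. It is a sketch under assumptions: the record layout of the claude-agent-acp JSONL is not documented in this report, so the walker simply looks for any nested object with `type == "image"` and a base64 `source.data` string. The function names are hypothetical.

```python
import base64
import json
from typing import Iterator, Tuple

IMAGE_MAGICS = (b"\x89PNG\r\n\x1a\n", b"\xff\xd8\xff", b"GIF87a", b"GIF89a")


def looks_like_image(head: bytes) -> bool:
    """True if the leading bytes match a PNG, JPEG, GIF or WebP signature."""
    return head.startswith(IMAGE_MAGICS) or (
        head[:4] == b"RIFF" and head[8:12] == b"WEBP"
    )


def iter_image_blocks(node) -> Iterator[dict]:
    """Yield every dict that looks like a base64 image block, at any depth."""
    if isinstance(node, dict):
        source = node.get("source")
        if (node.get("type") == "image" and isinstance(source, dict)
                and isinstance(source.get("data"), str)):
            yield node
        for value in node.values():
            yield from iter_image_blocks(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_image_blocks(item)


def find_bad_image_lines(jsonl_text: str) -> Iterator[Tuple[int, str]]:
    """Yield (1-based line number, claimed media type) for non-image payloads."""
    for lineno, line in enumerate(jsonl_text.splitlines(), start=1):
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip blank or malformed lines
        for block in iter_image_blocks(record):
            try:
                # 16 base64 chars decode to the 12 bytes needed for the check.
                head = base64.b64decode(block["source"]["data"][:16])
            except ValueError:
                head = b""  # undecodable data is treated as "not an image"
            if not looks_like_image(head):
                yield lineno, block["source"].get("media_type", "unknown")
```

Listing the offending line numbers first makes it possible to confirm which session file is affected before deleting anything.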
Suggested Fix
Two layers of defense, both worth doing:
1. Validate before forwarding (primary)
After fetching from Slack, before base64-encoding for the model API:
- Check the HTTP Content-Type header from the Slack response. Require the type to be in the model's supported set (for Anthropic Vision: image/png, image/jpeg, image/gif, image/webp). Generic image/* is too broad: image/svg+xml, image/heic, image/avif, image/tiff will be rejected by the model anyway.
- Check the response body's magic bytes. Match the full signatures, not just a leading byte:
  - PNG: 89 50 4e 47 0d 0a 1a 0a
  - JPEG: ff d8 ff (followed by e0/e1/e2/...)
  - GIF: 47 49 46 38 37 61 or 47 49 46 38 39 61
  - WebP: 52 49 46 46 ?? ?? ?? ?? 57 45 42 50
- If either check fails, surface a specific error to the user (e.g. "I couldn't access that image. Does the bot have the files:read OAuth scope?") and skip the model call.
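The checks above could look roughly like this. It is a sketch in Python for illustration only; the report does not show openab's fetch code, so `sniff_image_type` and `validate_fetched_image` are hypothetical names and the error text is just the wording suggested in the list.

```python
from typing import Optional

# Media types this report lists as accepted for Anthropic image input.
SUPPORTED_MEDIA_TYPES = {"image/png", "image/jpeg", "image/gif", "image/webp"}


def sniff_image_type(body: bytes) -> Optional[str]:
    """Return the media type implied by the leading magic bytes, or None."""
    if body.startswith(b"\x89PNG\r\n\x1a\n"):
        return "image/png"
    if body.startswith(b"\xff\xd8\xff"):
        return "image/jpeg"
    if body.startswith((b"GIF87a", b"GIF89a")):
        return "image/gif"
    # WebP: "RIFF", four size bytes (any value), then "WEBP".
    if len(body) >= 12 and body[:4] == b"RIFF" and body[8:12] == b"WEBP":
        return "image/webp"
    return None


def validate_fetched_image(content_type: Optional[str], body: bytes) -> str:
    """Return the verified media type, or raise ValueError with a user-facing hint."""
    # Drop parameters such as "; charset=utf-8" and normalise case.
    declared = (content_type or "").split(";", 1)[0].strip().lower()
    sniffed = sniff_image_type(body)
    if declared not in SUPPORTED_MEDIA_TYPES or sniffed is None:
        raise ValueError(
            "I couldn't access that image. Does the bot have the files:read "
            f"OAuth scope? (Content-Type: {declared or 'missing'}, "
            f"detected: {sniffed or 'not an image'})"
        )
    # Label the payload by what the bytes are, not by Slack's file metadata.
    return sniffed
```

Returning the sniffed type rather than the Slack-reported MIME means the label sent to the model always matches the bytes, which also covers a file whose metadata says PNG but whose content is JPEG.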
2. Don't persist unverified bytes to session history (secondary)
Persist the user-turn image block to the claude session JSONL only after the model call returns 200, not before. This way a 4xx from the model leaves the session in the same shape it was before the failed turn, and the operator's normal "fix config and try again" path works without manual JSONL cleanup.
This second defense matters even if (1) is in place: a real-but-corrupted upload can pass magic-byte check and still be rejected by the model, and once the bad block is in history the thread continues failing.
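A sketch of the "persist only after success" ordering, assuming the session layer can be handed a model-call function and a log-append function. `run_turn`, `ModelRejected`, `call_model` and `append_to_log` are hypothetical names; the real claude-agent-acp interfaces are not shown in this report and may differ.

```python
from typing import Callable, List


class ModelRejected(Exception):
    """Stand-in for a 4xx response from the model API."""


def run_turn(
    history: List[dict],
    user_message: dict,
    call_model: Callable[[List[dict]], dict],
    append_to_log: Callable[[dict], None],
) -> dict:
    """Send history plus the new message; persist both sides only on success."""
    candidate = history + [user_message]  # new list, nothing written yet
    reply = call_model(candidate)         # expected to raise ModelRejected on a 4xx
    # Only reached when the call succeeded, so a rejected turn leaves both
    # the in-memory history and the on-disk log exactly as they were.
    for message in (user_message, reply):
        history.append(message)
        append_to_log(message)
    return reply
```

With this ordering, a rejected image never reaches the JSONL, so the next message in the thread is sent with clean history.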
Severity / Impact
- Severity: high for any operator who hits the misconfiguration. The user-facing error blames Anthropic; root cause is a Slack scope; once a thread is poisoned, every reply in it is broken; cluster operator intervention (pod exec, file deletion) is required to recover.
- Surface: any openab Slack ingest path where the bot token may lack files:read. The chart does not enforce or check this scope: the chart 0.8.2 helm install NOTES do list files:read as required, but operators commonly miss it because pod startup succeeds and Slack appears to return 200 rather than 401 on the file fetch.
- Discord ingest: not tested — Discord has different auth/file URL semantics, but the same "fetched bytes forwarded without validation" code path may apply if shared.
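Because a missing scope does not stop the pod from starting, one option is to check the granted scopes explicitly at startup. The sketch below only parses the x-oauth-scopes header value that the auth.test call in the reproduction returns; `REQUIRED_SCOPES`, `parse_scopes` and `missing_scopes` are hypothetical names, and how openab would obtain the header is left open.

```python
from typing import Set

# Scopes this report identifies as required for file ingest.
REQUIRED_SCOPES = {"files:read"}


def parse_scopes(header_value: str) -> Set[str]:
    """Split an x-oauth-scopes header value into a set of scope names."""
    return {scope.strip() for scope in header_value.split(",") if scope.strip()}


def missing_scopes(header_value: str) -> Set[str]:
    """Return the required scopes that the token does not carry."""
    return REQUIRED_SCOPES - parse_scopes(header_value)
```

Comparing whole scope names rather than substrings avoids treating a different scope such as files:write as satisfying files:read.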
Regression / version scope
- Reproduced on chart 0.8.2 with image ghcr.io/openabdev/openab-claude:latest (digest sha256:3d2017efbf1ab9702a9e3e7eaccb50e1147488cad04517ba50f29f63fd07d0ab at the time of repro).
- Earlier versions not tested. Whether this is a recent regression or always present is unknown to us; if maintainers know when image ingest was introduced, that bounds the regression range.
Workaround state
Operators can:
- Add files:read to all bots' Slack apps and reinstall / rotate tokens (prevents new failures).
- Delete poisoned session JSONLs on the PVC (recovers existing broken threads). For us, grep -lE "Could not process image" /home/node/.claude/projects/-home-node/*.jsonl | xargs rm -v was sufficient; we did not need to touch thread_map.json or .claude/sessions/.
We have documented the operator-side runbook at https://github.com/heyu-ai/openab-workspace/blob/main/docs/runbook.md (search "Slack 整合問題", i.e. "Slack integration issues"). It covers diagnosis, remediation, and the PVC-cleanup recovery, but it is an operator workaround, not an upstream fix.
Environment
- openab chart 0.8.2
- Image digests at time of repro:
  - claude: ghcr.io/openabdev/openab-claude@sha256:3d2017efbf1ab9702a9e3e7eaccb50e1147488cad04517ba50f29f63fd07d0ab
  - codex: ghcr.io/openabdev/openab-codex@sha256:4cc5f6fcf6983f57cdcd29560f96ab0d4b144c4fd4dc96d65a04f229f865a2f0
  - gemini: ghcr.io/openabdev/openab-gemini@sha256:f157b30aecfba8fac873a04087e836ba45b5ac164ca09f58ac9959c716cb8bd8
- Three agents enabled (claude / codex / gemini); all three Slack apps had identical OAuth scope sets missing files:read before remediation
- Cluster: OrbStack-built-in K8s, single-tenant dev environment