fix(file): convert unsupported image formats to PNG in ReadMediaFile#2382

Open
Pluviobyte wants to merge 1 commit into MoonshotAI:main from Pluviobyte:fix/read-media-ico-conversion

@Pluviobyte Pluviobyte commented May 28, 2026

Related Issue

Resolve #2017

Description

Kimi (and the other providers in this repo's kosong layer — Anthropic, Google) accept only image/png, image/jpeg, image/gif, and image/webp for image input. When the agent calls ReadMediaFile on a .ico file the resulting data:image/x-icon;base64,... URL goes straight into session history. The next provider request then crashes with 400 unsupported image format: image/x-icon, and because the offending turn is already persisted in context.jsonl, every subsequent resume of that session hits the same error — the conversation becomes permanently unrecoverable (as described in #2017).

Fix

  • Add _normalize_image_for_provider() in src/kimi_cli/tools/file/read_media.py that decodes any image whose MIME is outside the supported set and re-encodes it as PNG before the data: URL is constructed.
  • Falls back to the original bytes (with logger.warning) when Pillow can't decode the file, so genuinely corrupt input keeps its previous failure mode instead of turning into a tool error.
  • The tool message still reports the file's original MIME so the user/model sees what was actually loaded.
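
The behaviour described in the bullets above could look roughly like the sketch below. It is an illustration only, not the code from this PR: the function name `normalize_image_sketch`, its signature, and the `to_data_url` helper are assumptions, and Pillow is imported lazily so that a missing or failing decoder takes the same fallback path as an undecodable file.

```python
import base64
import io
import logging

logger = logging.getLogger(__name__)

# MIME types the description lists as accepted for image input.
SUPPORTED_IMAGE_MIMES = frozenset(
    {"image/png", "image/jpeg", "image/gif", "image/webp"}
)


def normalize_image_sketch(data: bytes, mime_type: str) -> tuple[bytes, str]:
    """Hypothetical helper: return (bytes, mime) suitable for a data: URL.

    Supported formats pass through untouched; anything else is decoded
    and re-encoded as PNG. If decoding fails, the original bytes and
    MIME type are returned and a warning is logged.
    """
    if mime_type in SUPPORTED_IMAGE_MIMES:
        return data, mime_type
    try:
        from PIL import Image  # Pillow, imported lazily (assumption)

        with Image.open(io.BytesIO(data)) as img:
            buf = io.BytesIO()
            img.save(buf, format="PNG")
        return buf.getvalue(), "image/png"
    except Exception:
        # Fallback: keep the previous behaviour for undecodable input.
        logger.warning("Could not convert %s to PNG; keeping original bytes", mime_type)
        return data, mime_type


def to_data_url(data: bytes, mime_type: str) -> str:
    """Build the data: URL from the (possibly re-encoded) bytes."""
    payload = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{payload}"
```

In this sketch the caller would keep the originally detected MIME type for the tool message and use the returned MIME type only when building the data: URL.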

Why at the ReadMediaFile boundary

The corruption is persisted, not transient. Normalizing at the read boundary means the recorded ImageURLPart in context.jsonl is already model-compatible, so the very next provider call — and every resume thereafter — sees a well-formed image part.
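
To make the persistence argument concrete, here is a toy simulation. Everything in it is hypothetical: `fake_provider_request` stands in for a real provider call, the `image_url` record layout is invented for illustration, and the history is assumed to be replayed unchanged on every resume, as the description states.

```python
import json

SUPPORTED = {"image/png", "image/jpeg", "image/gif", "image/webp"}


def fake_provider_request(history_lines):
    """Hypothetical provider stub: rejects unsupported image MIME types."""
    for line in history_lines:
        record = json.loads(line)
        url = record.get("image_url", "")
        if url.startswith("data:"):
            # Extract the MIME type between "data:" and the first ";".
            mime = url[len("data:"):].split(";", 1)[0]
            if mime not in SUPPORTED:
                raise ValueError(f"400 unsupported image format: {mime}")
    return "ok"


def try_resume(history_lines):
    """Replay the stored history, as a resumed session is assumed to do."""
    try:
        return fake_provider_request(history_lines)
    except ValueError as exc:
        return str(exc)


# History recorded before the fix: the .ico data URL is stored as-is.
bad_history = [json.dumps({"image_url": "data:image/x-icon;base64,AAABAA=="})]
# History recorded after normalizing at the read boundary.
good_history = [json.dumps({"image_url": "data:image/png;base64,iVBORw0KGgo="})]

print(try_resume(bad_history))   # fails the same way on every resume
print(try_resume(good_history))  # succeeds
```

Because the stored record never changes, retrying the first history cannot succeed, which is why the normalization has to happen before the record is written.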

Checklist

  • I have read the CONTRIBUTING document.
  • I have linked the related issue.
  • I have added tests that prove the fix is effective.
  • I have updated CHANGELOG.md (manual Unreleased entry, following the existing one-line-per-bullet style; make gen-changelog not run because the change is mechanical).
  • make gen-docs not run — N/A: no user-facing config / CLI / ReadMediaFile description change.

Test plan

  • uv run ruff check src/kimi_cli/tools/file/read_media.py tests/tools/test_read_media_file.py → clean
  • uv run ruff format --check src/kimi_cli/tools/file/read_media.py tests/tools/test_read_media_file.py → clean
  • uv run pyright src/kimi_cli/tools/file/read_media.py tests/tools/test_read_media_file.py → 0 errors, 0 warnings, 0 informations
  • uv run pytest tests/tools/test_read_media_file.py -vv → 7 passed (1 new: test_read_ico_file_converts_to_png)

Proof of fix

Pre-patch behaviour, on current main:

>>> from kimi_cli.tools.file.utils import detect_file_type
>>> detect_file_type('favicon.ico', header=b'\x00\x00\x01\x00' + b'\x00' * 32)
FileType(kind='image', mime_type='image/x-icon')

→ the data URL data:image/x-icon;base64,... is written into history; the model gateway then rejects every following request.

Post-patch: test_read_ico_file_converts_to_png constructs a real .ico via Pillow, calls ReadMediaFile, and asserts the resulting ImageURLPart URL starts with data:image/png;base64,. ✅

Made with Cursor

Kimi (and Anthropic/Google) image input only accepts image/png,
image/jpeg, image/gif, and image/webp. When the agent calls
ReadMediaFile on a .ico file (mime image/x-icon), the resulting data URL
was written straight into session history. The next provider request
then crashed with `400 unsupported image format: image/x-icon`, and
because the offending turn was already persisted, every subsequent
resume hit the same error -- the conversation could never be continued.

Re-encode any non-supported image format to PNG at the ReadMediaFile
boundary so the persisted ImageURLPart is always model-compatible.
Falls back to the original bytes when Pillow cannot decode the file, so
genuine read failures keep their previous behaviour instead of turning
into tool errors.

Fixes MoonshotAI#2017

Co-authored-by: Cursor <cursoragent@cursor.com>

@devin-ai-integration devin-ai-integration Bot left a comment

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 4 additional findings.

Development

Successfully merging this pull request may close these issues.

The conversation cannot continue, there is a lot of context before it

1 participant