Handle invalid UTF-8 in whisper transcription responses#22

Merged
74th merged 1 commit into main from fix/whisper-cpp
Mar 14, 2026

74th (Owner) commented Mar 14, 2026

Summary

  • decode whisper.cpp and whisper-server JSON responses as UTF-8, substituting replacement characters for malformed byte sequences so they no longer abort transcription loading
  • emit a warning whenever replacement characters are introduced while parsing transcript JSON
  • add unit tests covering malformed UTF-8 handling for both the whisper.cpp and whisper-server paths
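
For readers unfamiliar with the approach, the decoding behavior described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the function name `load_transcript_json`, its `source` parameter, and the warning text are hypothetical, and the sketch assumes the raw response is available as `bytes`.

```python
import json
import warnings


def load_transcript_json(raw: bytes, source: str = "whisper.cpp") -> dict:
    """Parse transcript JSON from raw bytes, tolerating malformed UTF-8.

    Hypothetical helper for illustration; names are assumptions.
    """
    try:
        # Fast path: the payload is valid UTF-8.
        text = raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        # Fallback: substitute U+FFFD for each undecodable sequence
        # instead of aborting, and tell the caller that it happened.
        text = raw.decode("utf-8", errors="replace")
        warnings.warn(
            f"{source}: invalid UTF-8 at byte {exc.start}; "
            "replacement characters were inserted into the transcript",
            stacklevel=2,
        )
    return json.loads(text)


if __name__ == "__main__":
    # 0xFF can never appear in valid UTF-8, so this triggers the fallback.
    broken = b'{"text": "hello \xff world"}'
    print(load_transcript_json(broken, source="whisper-server"))
```

Trying a strict decode first means the warning fires only when bytes were actually replaced, rather than whenever the transcript happens to contain a literal U+FFFD.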

Testing

  • python -m unittest discover -s tests -p 'test_whisper_server.py'
  • make lint

74th merged commit c6bb024 into main on Mar 14, 2026
1 check passed
74th deleted the fix/whisper-cpp branch on March 14, 2026 00:50