feat(aws): add auto language detection and mid-stream language switch…#5435
Open
cldsime wants to merge 1 commit into livekit:main from
🔴 LanguageCode(None) crash when language identification is enabled and resp.language_code is missing
When identify_language or identify_multiple_languages is True, self._opts.language is set to None (lines 142-144 in the constructor). In _streaming_recognize_response_to_speech_data, line 431 calls LanguageCode(resp.language_code or self._opts.language). If resp.language_code is also None or empty (which can happen for partial results), the or expression evaluates to None and LanguageCode(None) is called. This crashes in _normalize_language (language.py:37) with AttributeError: 'NoneType' object has no attribute 'strip'.
(Refers to line 431)
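One possible guard, sketched under assumptions: resolve the language string before it ever reaches LanguageCode, and fall back to a placeholder when neither the response nor the options carry one. The helper name resolve_language and the "und" fallback (the BCP 47 tag for an undetermined language) are hypothetical and not part of the plugin; the PR could equally skip the event or keep the last detected language.

```python
from typing import Optional


def resolve_language(
    resp_language: Optional[str],
    configured: Optional[str],
    fallback: str = "und",  # assumption: BCP 47 "undetermined" as a placeholder
) -> str:
    """Return a non-empty language string that is safe to normalize.

    Hypothetical helper: prefers the language reported in the response,
    then the configured language, then the fallback.
    """
    # `or ""` guarantees a str, so .strip() cannot raise AttributeError on None.
    candidate = (resp_language or configured or "").strip()
    return candidate or fallback
```

With a helper like this, the call site would wrap the resolved string instead of the raw `or` expression, so a partial result without a language code no longer reaches the normalizer as None.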
Summary
Adds support for Amazon Transcribe's automatic language identification parameters to the livekit-plugins-aws STT plugin, enabling auto language detection and mid-stream language switching without requiring users to manually specify a language_code.

Changes

6 new parameters added to STTOptions and STT.__init__():
- identify_language: detect the dominant language for the stream
- identify_multiple_languages: detect language switches mid-stream
- language_options: comma-separated list of expected language codes (2–12 required)
- preferred_language: bias detection toward a specific language
- vocabulary_names: custom vocabularies per language
- vocabulary_filter_names: vocabulary filters per language

All default to disabled (False/NOT_GIVEN), preserving full backward compatibility.

Conditional config building in SpeechStream._run():
language_code and identify_language/identify_multiple_languages are mutually exclusive per the AWS API. The config builder now conditionally sends one or the other:

Bug fix: filtered_config boolean handling
The original filter {k: v for k, v in live_config.items() if v and is_given(v)} silently drops False booleans. Replaced with explicit type checking that correctly preserves booleans, numbers, and NOT_GIVEN values.

Usage
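As a rough illustration of the conditional config building and the boolean-preserving filter described above, here is a minimal, self-contained sketch. It does not import the plugin: the NOT_GIVEN sentinel, is_given, build_config, and the config key names are stand-ins chosen for illustration, and the actual code in SpeechStream._run() may differ.

```python
from typing import Any, Dict


class _NotGiven:
    """Hypothetical stand-in for the plugin's NOT_GIVEN sentinel."""

    def __bool__(self) -> bool:
        return False


NOT_GIVEN = _NotGiven()


def is_given(value: Any) -> bool:
    return not isinstance(value, _NotGiven)


def old_filter(config: Dict[str, Any]) -> Dict[str, Any]:
    # The filter quoted in the PR: truthiness check drops False and 0.
    return {k: v for k, v in config.items() if v and is_given(v)}


def new_filter(config: Dict[str, Any]) -> Dict[str, Any]:
    # Sketch of "explicit type checking": keep booleans and numbers even
    # when falsy, drop unset values and empty strings/collections.
    kept: Dict[str, Any] = {}
    for key, value in config.items():
        if not is_given(value) or value is None:
            continue
        if isinstance(value, (bool, int, float)) or value:
            kept[key] = value
    return kept


def build_config(
    language: Any = NOT_GIVEN,
    identify_language: bool = False,
    identify_multiple_languages: bool = False,
    language_options: Any = NOT_GIVEN,
    preferred_language: Any = NOT_GIVEN,
) -> Dict[str, Any]:
    # Illustrative key; shows that a False boolean survives new_filter.
    config: Dict[str, Any] = {"enable_partial_results_stabilization": False}
    if identify_language or identify_multiple_languages:
        # Language identification: send no language_code at all.
        if identify_multiple_languages:
            config["identify_multiple_languages"] = True
        else:
            config["identify_language"] = True
        config["language_options"] = language_options
        config["preferred_language"] = preferred_language
    else:
        config["language_code"] = language
    return new_filter(config)


if __name__ == "__main__":
    print(build_config(identify_multiple_languages=True, language_options="en-US,es-US"))
    print(build_config(language="en-US"))
```

The two branches never emit language_code together with an identification flag, which mirrors the mutual exclusivity the description attributes to the AWS API.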
Test Results
Tested with identify_multiple_languages=True and 12 language codes. All 12 configured languages were successfully detected across test sessions with mid-stream switching:
Sample output showing mid-stream language switching in a single session:
Backward Compatibility
All new parameters default to disabled. Existing code continues to work without any changes.