[inworld] use audio/pcm to leverage the fast AudioBytesStream path by ianbbqzy · Pull Request #4803 · livekit/agents

ianbbqzy · 2026-02-12T19:05:45Z

strip wav header manually before pushing to emitter
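
For readers unfamiliar with the layout, here is a minimal sketch of what stripping a WAV header can look like. It is not the PR's `_strip_wav_header`; the names `strip_wav_header` and `make_wav` are hypothetical. It assumes the whole RIFF header arrives in the first buffer and that no other chunks follow the `data` chunk.

```python
import io
import struct
import wave


def strip_wav_header(buf: bytes) -> bytes:
    """Return the PCM payload of a RIFF/WAVE buffer (hypothetical helper).

    Buffers that do not start with a RIFF/WAVE header are returned unchanged,
    on the assumption that they are already headerless PCM.
    """
    if len(buf) < 12 or buf[:4] != b"RIFF" or buf[8:12] != b"WAVE":
        return buf
    pos = 12
    # Walk the chunk list until the "data" chunk is found.
    while pos + 8 <= len(buf):
        chunk_id = buf[pos:pos + 4]
        (chunk_size,) = struct.unpack_from("<I", buf, pos + 4)
        pos += 8
        if chunk_id == b"data":
            # Streaming servers may not know the final size, so take
            # everything after the chunk header instead of trusting chunk_size.
            return buf[pos:]
        # Chunks are padded to an even number of bytes.
        pos += chunk_size + (chunk_size & 1)
    return b""  # header seen, but no audio payload in this buffer


def make_wav(pcm: bytes, sample_rate: int = 24000) -> bytes:
    """Wrap 16-bit mono PCM in a WAV container, for demonstration only."""
    out = io.BytesIO()
    with wave.open(out, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(sample_rate)
        w.writeframes(pcm)
    return out.getvalue()
```

Calling `strip_wav_header(make_wav(pcm))` should give back `pcm`, and passing raw PCM through it should be a no-op.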

CLAassistant · 2026-02-12T19:05:57Z

All committers have signed the CLA.

devin-ai-integration

Devin Review found 1 potential issue.

livekit-plugins/livekit-plugins-inworld/livekit/plugins/inworld/tts.py

ianbbqzy · 2026-02-12T19:47:15Z

cc @theomonnom @longcw. Thanks!

davidzhao · 2026-02-13T05:14:26Z

livekit-plugins/livekit-plugins-inworld/livekit/plugins/inworld/tts.py

+            # AudioByteStream path instead of the async AudioStreamDecoder.
+            # WAV headers from the server are stripped before pushing to the
+            # emitter (see _strip_wav_header).
+            return "audio/pcm"


If it's a WAV file being sent back, you should just pass back audio/wav; our decoder has a fast path for decoding WAV files.

This would be preferred over having multiple WAV handling paths in the code base.

I believe that's the existing behavior. When encoding is LINEAR16, it falls through to the else branch, which returns audio/wav. But I measured that path to add ~30-40 ms of latency.

@tinalenguyen would you be able to advise here?

I tested with my own websocket benchmark script at https://github.com/inworld-ai/inworld-api-examples/tree/ian/livekit-integrations/integrations/livekit/python/benchmarks, running 100+ iterations.

summarized by AI, the reason is that

| | `audio/pcm` (raw) | `audio/wav` (decoder) |
|---|---|---|
| Threading | None (all in main loop) | `ThreadPoolExecutor` + `StreamBuffer` with locks |
| Cross-thread hops | 0 | 2 (main→thread via `StreamBuffer`, thread→main via `call_soon_threadsafe`) |
| Event loop cycles to first frame | 1 | 3+ (push → thread wake → thread decode → `call_soon_threadsafe` → decode_task scheduled → decode_task runs) |
| AudioByteStream instances | 1 | 2 (one in `_decode_wav_loop`, one in `_decode_task`) |
| Buffer copies per read | 0 | `StreamBuffer.read()` creates new `BytesIO` + copies remaining bytes every read |
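
To make the "cross-thread hops" row concrete, here is a small illustrative sketch using only `asyncio` and `concurrent.futures`. It is not LiveKit's decoder code: `split_frames`, `same_loop_path`, `thread_hop_path`, `compare`, and the `FRAME_BYTES` value are hypothetical stand-ins. It only contrasts framing bytes directly on the event loop with handing them to a worker thread and returning via `call_soon_threadsafe`.

```python
from __future__ import annotations

import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

FRAME_BYTES = 960  # assumed frame size, chosen only for illustration


def split_frames(pcm: bytes) -> list[bytes]:
    """Cut a PCM buffer into fixed-size frames, dropping any partial tail."""
    end = len(pcm) - FRAME_BYTES + 1
    return [pcm[i:i + FRAME_BYTES] for i in range(0, end, FRAME_BYTES)]


async def same_loop_path(pcm: bytes) -> list[bytes]:
    # Raw-PCM style: the work happens inline on the event loop, no thread hop.
    return split_frames(pcm)


async def thread_hop_path(pcm: bytes, pool: ThreadPoolExecutor) -> list[bytes]:
    # Decoder style: hop to a worker thread, then hop back to the loop.
    loop = asyncio.get_running_loop()
    done = loop.create_future()

    def worker() -> None:
        frames = split_frames(pcm)
        loop.call_soon_threadsafe(done.set_result, frames)

    pool.submit(worker)
    return await done


async def compare(iterations: int = 200) -> tuple[float, float]:
    """Return mean seconds per call for (same-loop, thread-hop)."""
    pcm = bytes(FRAME_BYTES * 4)
    with ThreadPoolExecutor(max_workers=1) as pool:
        t0 = time.perf_counter()
        for _ in range(iterations):
            await same_loop_path(pcm)
        t1 = time.perf_counter()
        for _ in range(iterations):
            await thread_hop_path(pcm, pool)
        t2 = time.perf_counter()
    return (t1 - t0) / iterations, (t2 - t1) / iterations
```

Running `asyncio.run(compare())` prints nothing by itself; it returns the two per-call averages so they can be compared on a given machine. The absolute numbers from a toy like this say nothing about the ~30-40 ms figure above, which came from the linked websocket benchmark.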

> When encoding is LINEAR16, it falls to the else branch of audio/wav. But I measured that to have ~30-40ms additional latency.

Could you share the benchmark scripts that you ran?

if AudioStreamDecoder is slow, we should optimize that instead. I still maintain that we should not be duplicating decoding logic within plugin code

Hi David, can you access this script: https://github.com/inworld-ai/inworld-api-examples/tree/ian/livekit-integrations/integrations/livekit/python/benchmarks ? The README has instructions for running it. You should be able to check out this feature branch in the submodule to see the difference.

We might also consider adding a new audio encoding format, PCM, for headerless audio bytes on the server side, to differentiate it from the current encoding format.

devin-ai-integration bot reviewed Feb 12, 2026

ianbbqzy force-pushed the ian/inworld-pcm branch from f279099 to 43325a9 Compare February 12, 2026 19:46

davidzhao reviewed Feb 13, 2026

[inworld] add User-Agent and X-Request-Id for better traceability (li…

74a4f88

…vekit#4784)

ianbbqzy force-pushed the ian/inworld-pcm branch from 43325a9 to 74a4f88 Compare February 13, 2026 22:55

ianbbqzy requested a review from davidzhao February 17, 2026 18:44

ianbbqzy wants to merge 1 commit into livekit:main from ianbbqzy:ian/inworld-pcm


