fix(lemonslice): bind avatar audio output before session start #1593
rosetta-livekit-bot[bot] wants to merge 1 commit into
🦋 Changeset detected
Latest commit: 2c03b52
The changes in this PR will be included in the next version bump. This PR includes changesets to release 33 packages.
      waitPlaybackStart: true,
    });

    this.#logger.debug('starting avatar session');
    const sessionId = await this.startAgent(livekitUrl, livekitToken);
🔴 Audio output left in broken state if startAgent HTTP call fails
The reordering sets agentSession.output.audio to a new DataStreamAudioOutput (line 254) before the startAgent HTTP call (line 263). If startAgent throws — either an APIStatusError for non-retryable errors (line 319) or an APIConnectionError after all retries are exhausted (line 336) — the exception propagates out of start(), but agentSession.output.audio has already been permanently replaced with a DataStreamAudioOutput targeting an avatar participant that will never join the room. The DataStreamAudioOutput will then wait indefinitely in waitForParticipant (agents/src/voice/avatar/datastream_io.ts:128) for a participant that never connects, effectively making all subsequent audio output hang. The previous code (and all other avatar plugins — hedra, trugen, bey, anam, tavus, runway) set the audio output only after the upstream session creation succeeds, avoiding this corrupted-state-on-failure issue.
(Refers to lines 254-263)
Prompt for agents
The audio output is assigned before startAgent, so if startAgent throws, agentSession.output.audio is left pointing to a DataStreamAudioOutput for a non-existent avatar participant. The fix should save the original audio output before the assignment and restore it in a catch/finally block if startAgent fails. Something like:
    const previousAudio = agentSession.output.audio;
    agentSession.output.audio = new voice.DataStreamAudioOutput({...});
    try {
      const sessionId = await this.startAgent(livekitUrl, livekitToken);
      return sessionId;
    } catch (e) {
      agentSession.output.audio = previousAudio;
      throw e;
    }
Relevant files: plugins/lemonslice/src/avatar.ts (start method, lines 198-266), agents/src/voice/avatar/datastream_io.ts (DataStreamAudioOutput constructor and _start method).
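The save-and-restore idea above can be tried in isolation. The following is a minimal, self-contained sketch, not the plugin's code: `AudioOutput`, `SessionLike`, and `bindAudioThenStart` are hypothetical stand-ins for `agentSession.output.audio`, the agent session, and the `start` method, and the upstream call is passed in as a plain async function. It also adds one guard that the suggestion above does not have: the rollback is skipped if something else has rebound the output in the meantime.

```typescript
// Hypothetical stand-ins for the real types; only the shape matters here.
interface AudioOutput {
  readonly label: string;
}

interface SessionLike {
  output: { audio: AudioOutput | null };
}

/**
 * Binds `next` as the audio output, runs `startUpstream`, and restores the
 * previous output if the upstream call rejects.
 */
async function bindAudioThenStart<T>(
  session: SessionLike,
  next: AudioOutput,
  startUpstream: () => Promise<T>,
): Promise<T> {
  const previous = session.output.audio;
  // Bind early so that audio produced while the call is pending reaches `next`.
  session.output.audio = next;
  try {
    return await startUpstream();
  } catch (err) {
    // Assumption: roll back only if nobody else replaced the output meanwhile.
    if (session.output.audio === next) {
      session.output.audio = previous;
    }
    throw err;
  }
}
```

With this shape, a failed upstream call leaves the session pointing at whatever output it had before, while a successful call leaves the new output bound.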
Summary
Rebind agent_session.output.audio to the avatar-bound DataStreamAudioOutput before the LemonSlice session-creation HTTP call instead of after. The call is on the order of a second, and during a mid-session avatar swap any generate_reply triggered in that gap stays routed to the previous audio destination. wait_remote_track=KIND_VIDEO buffers frames until the new video track actually shows up, so binding early doesn't drop audio.
Test plan
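The ordering issue described in the summary can be checked with a small simulation. The following is a minimal sketch, not the PR's actual tests: `swapAvatar` and `simulate` are hypothetical names, the audio destinations are plain arrays, and the session-creation HTTP call is modelled as a promise that is resolved by hand so the ordering is deterministic. It does not model the frame buffering that `wait_remote_track` provides.

```typescript
type Sink = string[];

interface SessionSketch {
  audio: Sink;
}

// Swaps the audio destination either before or after the upstream call.
async function swapAvatar(
  session: SessionSketch,
  next: Sink,
  bindEarly: boolean,
  startUpstream: () => Promise<void>,
): Promise<void> {
  if (bindEarly) session.audio = next;
  await startUpstream();
  if (!bindEarly) session.audio = next;
}

// Fires one reply while the upstream call is still pending and reports
// which destination received it.
async function simulate(
  bindEarly: boolean,
): Promise<{ previous: Sink; next: Sink }> {
  const previous: Sink = [];
  const next: Sink = [];
  const session: SessionSketch = { audio: previous };

  let finishHttp: () => void = () => {};
  const httpCall = new Promise<void>((resolve) => {
    finishHttp = resolve;
  });

  const swap = swapAvatar(session, next, bindEarly, () => httpCall);
  session.audio.push('reply'); // triggered in the gap, before the call returns
  finishHttp();
  await swap;

  return { previous, next };
}
```

In this model, `simulate(false)` leaves the reply in `previous` and `simulate(true)` leaves it in `next`, which mirrors the routing difference the summary describes.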