port over Python fallback fixes #1147
    async disableInterruptionDetection(): Promise<void> {
      this.isInterruptionEnabled = false;
      this.interruptionDetection = undefined;
      await this.interruptionTask?.cancelAndWait();
      this.interruptionTask = undefined;
      await this.interruptionStreamChannel?.close();
      this.interruptionStreamChannel = undefined;
    }
🟡 Held user transcripts silently dropped when falling back from adaptive to VAD interruption
When disableInterruptionDetection() is called from fallbackToVadInterruption(), the transcriptBuffer in AudioRecognition is never flushed. STT events held during overlap speech (agent speaking + user speaking) are permanently lost.
The issue arises because disableInterruptionDetection() sets this.isInterruptionEnabled = false without first flushing the held transcripts. After this, all code paths that could flush or clear the buffer are blocked:
- flushHeldTranscripts() returns early at agents/src/voice/audio_recognition.ts:325 (!this.isInterruptionEnabled)
- shouldHoldSttEvent() returns false at agents/src/voice/audio_recognition.ts:400, so the START_OF_SPEECH branch at lines 408-411 that clears the buffer is never reached
- onEndOfAgentSpeech() returns early at agents/src/voice/audio_recognition.ts:267 without flushing
This means if a user was speaking over the agent when the interruption detector timed out, those words are silently dropped during the transition to VAD-based interruption.
Prompt for agents
In agents/src/voice/audio_recognition.ts, the disableInterruptionDetection() method (lines 252-259) should flush the transcriptBuffer before setting isInterruptionEnabled = false. Before setting isInterruptionEnabled = false on line 253, add code to process all held events:
1. Save the current transcriptBuffer to a local variable
2. Clear transcriptBuffer and ignoreUserTranscriptUntil
3. Then set isInterruptionEnabled = false and proceed with the rest of the cleanup
4. After cleanup, iterate through the saved events and call this.onSTTEvent(ev) for each one (since isInterruptionEnabled is now false, they'll be processed normally without being re-held)
The key insight is that flushHeldTranscripts() cannot be used as-is because it checks isInterruptionEnabled. The events must be re-processed through onSTTEvent after disabling interruption so they flow through normal STT processing.
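The four steps above can be sketched roughly as follows. This is a simplified, hypothetical stand-in for AudioRecognition, not the real class: RecognitionSketch, SttEvent, agentSpeaking, and processed are illustrative names, the hold condition is an assumption, and the task and channel cleanup from the real method is only indicated by a comment.

```typescript
// Hypothetical, simplified model of the proposed fix; not the actual AudioRecognition class.
type SttEvent = { type: string; text?: string };

class RecognitionSketch {
  isInterruptionEnabled = true;
  agentSpeaking = true; // assumption: stands in for the real overlap-speech state
  transcriptBuffer: SttEvent[] = [];
  ignoreUserTranscriptUntil: number | undefined = undefined;
  processed: SttEvent[] = []; // stands in for normal downstream STT handling

  // Assumed hold condition: hold only while adaptive interruption is on
  // and the agent is speaking.
  private shouldHoldSttEvent(): boolean {
    return this.isInterruptionEnabled && this.agentSpeaking;
  }

  async onSTTEvent(ev: SttEvent): Promise<void> {
    if (this.shouldHoldSttEvent()) {
      this.transcriptBuffer.push(ev);
      return;
    }
    this.processed.push(ev);
  }

  async disableInterruptionDetection(): Promise<void> {
    // 1. Save the held events before any state changes.
    const held = this.transcriptBuffer;
    // 2. Clear the buffer and the ignore window.
    this.transcriptBuffer = [];
    this.ignoreUserTranscriptUntil = undefined;
    // 3. Disable adaptive interruption. The real method would also cancel
    //    interruptionTask and close interruptionStreamChannel here.
    this.isInterruptionEnabled = false;
    // 4. Replay the saved events; they are no longer held because
    //    isInterruptionEnabled is now false.
    for (const ev of held) {
      await this.onSTTEvent(ev);
    }
  }
}
```

Replaying in buffer order keeps the original event sequence intact, so downstream handling sees the transcripts in the order the STT produced them.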
…ction (#1149) Co-authored-by: devin-ai-integration[bot] <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Description
Fall back to VAD when the barge-in stream returns an error. Python PR: livekit/agents#5142
Changes Made
Pre-Review Checklist
Testing
restaurant_agent.ts and realtime_agent.ts work properly (for major changes)
Additional Notes
Note to reviewers: Please ensure the pre-review checklist is completed before starting your review.