Add retry semantics and error recovery for adaptive interruption detection by toubatbrian · Pull Request #1149 · livekit/agents-js

toubatbrian · 2026-03-19T02:16:12Z

Summary

Ports Python's recoverable/unrecoverable retry semantics for adaptive interruption detection (agents#5142) to the JS SDK, and fixes a JS-specific bug where interruption detection becomes permanently dead after a transient transport error.

Changes

Transport error classification (http_transport.ts, ws_transport.ts)

Wrap transport errors as APIError subclasses (APIStatusError, APIConnectionError, APITimeoutError) instead of generic Error — enables the retry loop to classify errors via isAPIError() and .retryable
Disable ofetch's internal retry (retry: 0) so retry logic is centralized in the stream consumer
Propagate errors via controller.error() instead of swallowing them with logger.error
Remove internal retry loop from WS ensureConnection() (retry is now handled at the stream level)
Retry loop in createInterruptionTask (audio_recognition.ts)
Restructure from a single-stream-per-session design to a while loop that creates a fresh InterruptionStreamBase on each retry attempt, matching Python's _main_task pattern
Classify caught errors: retryable APIError → emit recoverable error + retry with exponential backoff; non-retryable or retries exhausted → emit unrecoverable error + break (triggers VAD fallback)
Catch forwardTask rejections in finally to prevent unhandled rejections from breaking the retry loop
Re-inject agent-speech-started sentinel on retry — JS creates a new stream per retry (unlike Python where self._agent_speech_started persists on the instance), so the new stream's agentSpeechStarted closure variable must be re-initialized
Type and logic alignment (interruption.ts, agent_activity.ts, report.ts)
Remove false from InterruptionOptions.mode union type to match Python's Literal["adaptive", "vad"]
Simplify resolveInterruptionDetector conditions accordingly
Fix allowInterruptions computation in session report

Test plan

Build passes (pnpm build --filter agents)
Tested transient error scenario: single HTTP 500 → retry → adaptive interruption resumes and successfully detects interruptions
Tested persistent error scenario: repeated HTTP 500s → retries exhaust → emits unrecoverable error → falls back to VAD → user can still interrupt via VAD
Verified onInterruptionError correctly routes recoverable errors (log only) vs unrecoverable errors (fallback to VAD)
Error handling is transport-agnostic: both HTTP and WS transports propagate APIError subclasses through controller.error(), consumed identically by the retry loop

changeset-bot · 2026-03-19T02:16:16Z

⚠️ No Changeset found

Latest commit: 8ded378

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

toubatbrian · 2026-03-19T02:18:29Z

agents/src/voice/turn_config/interruption.ts

   * @defaultValue undefined
   */
-  mode: 'adaptive' | 'vad' | false | undefined;
+  mode: 'adaptive' | 'vad' | undefined;


Remove false to keep aligned with python impl

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f1141df03e

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-03-19T02:25:33Z

agents/src/voice/audio_recognition.ts

+      if (numRetries > 0 && this.isAgentSpeaking) {
+        await stream.pushFrame(InterruptionStreamSentinel.agentSpeechStarted());
      }


Rehydrate overlap state when recreating interruption stream

On retry, the code only re-injects agent-speech-started into the new interruption stream. If the transport failure happens during an active overlap, the replacement stream never receives overlap-speech-started, so its internal overlapSpeechStarted flag stays false and incoming audio frames are not sent for adaptive inference until a brand-new overlap begins. This causes the current interruption to be missed after a transient error, which undermines the intended error-recovery behavior.

Useful? React with 👍 / 👎.

agents/src/voice/audio_recognition.ts

Co-authored-by: devin-ai-integration[bot] <158243242+devin-ai-integration[bot]@users.noreply.github.com>

lukasIO

apart from my other comment about catching abort before the async push, this looks good to me.

Generally not a fan of the nested while loop as it introduces a lot of logic flow through multiple break levels, but we can address this afterwards also.

…ttps://github.com/livekit/agents-js into brian/unify-interruption-transport-error-handling

devin-ai-integration

Devin Review found 1 new potential issue.

View 7 additional findings in Devin Review.

devin-ai-integration · 2026-03-19T08:42:02Z

agents/src/inference/interruption/ws_transport.ts

+      outputController?.error(
+        new APIConnectionError({ message: `WebSocket error: ${err.message}` }),
+      );


🔴 Multiple controller.error() calls in ws_transport can crash due to WHATWG Streams assertion

The PR changes the socket.on('error') handler (line 141) and transform catch block (line 381) from logging errors to calling controller.error() / outputController.error(). These join the existing MSG_ERROR handler (ws_transport.ts:277) and timeout check (ws_transport.ts:368) that already call controller.error(). If any two of these fire in sequence (e.g., sendAudioData throws APIConnectionError because ws.readyState !== WebSocket.OPEN, the transform catch calls controller.error(), and then the pending socket.on('error') event fires and calls outputController.error()), the second call attempts to error an already-errored stream. In Node.js's WHATWG Streams implementation, ReadableStreamError asserts stream.state === 'readable', so the second call throws an AssertionError that is unhandled in the event handler, crashing the process.

The old code avoided this by only logging in socket.on('error') and transform catch, so only MSG_ERROR and the timeout check could call controller.error().

Prompt for agents

In agents/src/inference/interruption/ws_transport.ts, guard all calls to outputController.error() and controller.error() against the stream already being in an errored state. One approach: introduce a boolean flag (e.g. let streamErrored = false) scoped alongside outputController at line 123. Before each controller.error() call (lines 141-143, 277-280, 368-370, 381), check if streamErrored is false, and after a successful controller.error() call, set streamErrored = true. This prevents the second controller.error() call from triggering an assertion failure in Node.js's WHATWG Streams implementation. Affected locations: - Line 141-143: socket.on('error') handler in setupMessageHandler - Line 277-280: MSG_ERROR case in handleMessage - Line 368-370: timeout check in transform - Line 381: catch block in transform

Was this helpful? React with 👍 or 👎 to provide feedback.

toubatbrian added 2 commits March 18, 2026 18:30

save

d9ea9c5

Update audio_recognition.ts

f1141df

toubatbrian requested a review from lukasIO March 19, 2026 02:16

toubatbrian requested review from a team and chenghao-mou March 19, 2026 02:16

toubatbrian commented Mar 19, 2026

View reviewed changes

This comment was marked as resolved.

Sign in to view

chatgpt-codex-connector bot reviewed Mar 19, 2026

View reviewed changes

lukasIO reviewed Mar 19, 2026

View reviewed changes

agents/src/voice/audio_recognition.ts Show resolved Hide resolved

Update agents/src/voice/audio_recognition.ts

978fb1e

Co-authored-by: devin-ai-integration[bot] <158243242+devin-ai-integration[bot]@users.noreply.github.com>

lukasIO approved these changes Mar 19, 2026

View reviewed changes

toubatbrian added 2 commits March 19, 2026 01:33

fix pr comments

ff82b03

Merge branch 'brian/unify-interruption-transport-error-handling' of h…

8ded378

…ttps://github.com/livekit/agents-js into brian/unify-interruption-transport-error-handling

devin-ai-integration bot reviewed Mar 19, 2026

View reviewed changes

toubatbrian merged commit 71af1c4 into chenghao/fix/barge-in-fallback Mar 19, 2026
2 checks passed

toubatbrian deleted the brian/unify-interruption-transport-error-handling branch March 19, 2026 08:45

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add retry semantics and error recovery for adaptive interruption detection#1149

Add retry semantics and error recovery for adaptive interruption detection#1149
toubatbrian merged 5 commits intochenghao/fix/barge-in-fallbackfrom
brian/unify-interruption-transport-error-handling

toubatbrian commented Mar 19, 2026

Uh oh!

changeset-bot bot commented Mar 19, 2026 •

edited

Loading

Uh oh!

toubatbrian Mar 19, 2026

Uh oh!

This comment was marked as resolved.

Uh oh!

chatgpt-codex-connector bot left a comment

Uh oh!

chatgpt-codex-connector bot Mar 19, 2026

Uh oh!

Uh oh!

Uh oh!

lukasIO left a comment •

edited

Loading

Uh oh!

devin-ai-integration bot left a comment

Uh oh!

devin-ai-integration bot Mar 19, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

toubatbrian commented Mar 19, 2026

Summary

Changes

Test plan

Uh oh!

changeset-bot bot commented Mar 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

⚠️ No Changeset found

Uh oh!

toubatbrian Mar 19, 2026

Choose a reason for hiding this comment

Uh oh!

This comment was marked as resolved.

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector bot Mar 19, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

lukasIO left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration bot left a comment

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration bot Mar 19, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

changeset-bot bot commented Mar 19, 2026 •

edited

Loading

lukasIO left a comment •

edited

Loading