Closed
67 commits
4e31b27
Add turn config interfaces and defaults (#975)
lukasIO Jan 16, 2026
f3c7430
Merge branch 'main' into feat/barge-in
Toubat Jan 21, 2026
07c5d71
Add AdaptiveInterruptionDetector (#980)
lukasIO Jan 27, 2026
f1a2114
Merge branch 'main' into feat/barge-in
lukasIO Jan 27, 2026
c861f50
Add agent activity interruption detector integration (#991)
lukasIO Jan 29, 2026
1862dc3
remove aic
lukasIO Jan 29, 2026
705ed33
reuse
lukasIO Jan 29, 2026
8f53889
remove tests for legacy stream approach
lukasIO Jan 29, 2026
7d24bf0
fix util migration tests
lukasIO Jan 29, 2026
c78cf58
comment out example tests
lukasIO Jan 29, 2026
d5b271c
Rename files to underscore cases (#1007)
toubatbrian Jan 30, 2026
b020180
update date
lukasIO Jan 30, 2026
dbad1e4
update date
lukasIO Jan 30, 2026
d882012
update defaults
lukasIO Jan 30, 2026
67e8f6c
deprecate legacy options and update tests
lukasIO Jan 30, 2026
96d6b57
fix internal types
lukasIO Jan 30, 2026
2ee2748
rabbit comments
lukasIO Jan 30, 2026
62cd448
remove unused stuff
lukasIO Jan 30, 2026
ec6d9bd
more rabbit fixes
lukasIO Jan 30, 2026
016e3a4
better cleanup
lukasIO Jan 30, 2026
9a4939c
ensure inputStartedAt is set
lukasIO Jan 30, 2026
e28b1b1
Fix Inference URL parity (#1011)
toubatbrian Feb 2, 2026
4310baa
Preserve turnDetection after cloning
toubatbrian Feb 2, 2026
175e57b
respect LIVEKIT_REMOTE_EOT_URL environment variable
toubatbrian Feb 2, 2026
0682f25
refine timeout computation
toubatbrian Feb 3, 2026
a820521
Merge branch 'main' into feat/barge-in
lukasIO Feb 4, 2026
e175e7d
migrate turnhandling options on agent level
lukasIO Feb 4, 2026
9cb0a29
Create silly-donkeys-shop.md
lukasIO Feb 4, 2026
5f088b9
add explicit logging for sample rate error
lukasIO Feb 4, 2026
6fbc417
conditionally create interruption stream channel only when detection …
lukasIO Feb 4, 2026
9f2932d
propagate updated turn detection options
lukasIO Feb 4, 2026
b4a82ad
fix comment
lukasIO Feb 4, 2026
f2ac83a
Merge branch 'feat/barge-in' of github.com:livekit/agents-js into fea…
lukasIO Feb 4, 2026
ec26bb1
fix tests
lukasIO Feb 4, 2026
b5c541f
migrate allowInterruptions
lukasIO Feb 4, 2026
76bd4e8
fix inputStartedAt assignment
lukasIO Feb 4, 2026
245bc66
Session Usage Collection (#1014)
toubatbrian Feb 5, 2026
ea27278
resolve comments
toubatbrian Feb 5, 2026
63eccca
Merge branch 'main' into feat/barge-in
toubatbrian Feb 5, 2026
5bc7108
Merge branch 'main' into feat/barge-in
toubatbrian Feb 9, 2026
171fb98
Update audio_recognition.ts
toubatbrian Feb 9, 2026
bb93420
change config unit to ms
toubatbrian Feb 10, 2026
717e908
Add Metrics to OTEL Chat History (#1037)
toubatbrian Feb 10, 2026
b935b0d
keep deprecated field for session report
toubatbrian Feb 10, 2026
74e42fa
Merge branch 'feat/barge-in' of https://github.com/livekit/agents-js …
toubatbrian Feb 10, 2026
eaafc14
Merge branch 'main' into feat/barge-in
lukasIO Feb 12, 2026
8d14806
Port barge in fixes from python implementation (#1047)
lukasIO Feb 17, 2026
cfe0362
fix typo
lukasIO Feb 17, 2026
ff277d5
Align agents-js with Python PR #4834 refactor and follow-up fixes (#1…
toubatbrian Feb 21, 2026
b0dbbf5
Remote Session Events (#1073)
toubatbrian Feb 26, 2026
1caed4f
Bargein Model Metrics Usages (#1079)
toubatbrian Feb 27, 2026
7138bc9
Merge branch 'main' into feat/barge-in
toubatbrian Feb 27, 2026
dad12f8
Merge branch 'feat/barge-in' of https://github.com/livekit/agents-js …
toubatbrian Feb 27, 2026
9ae1e9a
resolve conflicts
toubatbrian Feb 27, 2026
f58454b
format code
toubatbrian Feb 27, 2026
dbff0f3
Make test working
toubatbrian Mar 2, 2026
44684ac
fix lint and tests
toubatbrian Mar 2, 2026
66a6b85
Merge branch 'main' into feat/barge-in
toubatbrian Mar 2, 2026
7ef1ecf
Lint
toubatbrian Mar 2, 2026
0956efc
Merge branch 'main' into feat/barge-in
toubatbrian Mar 4, 2026
db163f3
Update interruption_detector.ts
toubatbrian Mar 5, 2026
1cf1aa4
Attach user_speaking span (#1113)
toubatbrian Mar 9, 2026
7cd3e9d
Merge branch 'main' into feat/barge-in
toubatbrian Mar 12, 2026
1639b0d
add claude md for 1.1.x
toubatbrian Mar 14, 2026
cfb3893
merge
toubatbrian Mar 19, 2026
e652ee5
Merge branch 'main' into brian/claude-md
toubatbrian Mar 19, 2026
74759ff
remove changesets
toubatbrian Mar 19, 2026
5 changes: 0 additions & 5 deletions .changeset/fair-beers-wave.md

This file was deleted.

2 changes: 2 additions & 0 deletions .gitignore
@@ -199,6 +199,8 @@ examples/src/test_*.ts
!CONTRIBUTING.md
!.CODE_OF_CONDUCT.md

!**/CLAUDE.md

# OpenTelemetry trace test output
.traces/
*.traces.json
136 changes: 136 additions & 0 deletions CLAUDE.md
@@ -0,0 +1,136 @@
# CLAUDE.md

🔴 Missing SPDX license header on new file CLAUDE.md (root)

The root CLAUDE.md is a new file but lacks the SPDX license header required by CONTRIBUTING.md ("When creating a new file, make sure to add SPDX headers for REUSE-3.2 compliance") and by the CLAUDE.md Code Conventions section itself. All existing markdown files in the repo (e.g., CONTRIBUTING.md:1-5, README.md:1-5, CODE_OF_CONDUCT.md:1-5) include an HTML-comment SPDX header. Additionally, REUSE.toml has no annotation covering **/CLAUDE.md, so these files will fail the REUSE/SPDX CI check.

Prompt for agents
Add SPDX license headers to all 14 new CLAUDE.md files. Each file should start with:

<!--
SPDX-FileCopyrightText: 2026 LiveKit, Inc.

SPDX-License-Identifier: Apache-2.0
-->

before the existing # CLAUDE.md heading.

The affected files are:
- CLAUDE.md
- agents/src/inference/CLAUDE.md
- agents/src/ipc/CLAUDE.md
- agents/src/llm/CLAUDE.md
- agents/src/metrics/CLAUDE.md
- agents/src/stream/CLAUDE.md
- agents/src/stt/CLAUDE.md
- agents/src/telemetry/CLAUDE.md
- agents/src/tokenize/CLAUDE.md
- agents/src/tts/CLAUDE.md
- agents/src/voice/CLAUDE.md
- agents/src/voice/room_io/CLAUDE.md
- agents/src/voice/transcription/CLAUDE.md
- agents/src/voice/turn_config/CLAUDE.md

Alternatively, add a REUSE.toml annotation entry:

[[annotations]]
path = ["**/CLAUDE.md"]
SPDX-FileCopyrightText = "2026 LiveKit, Inc."
SPDX-License-Identifier = "Apache-2.0"

Either approach satisfies REUSE-3.2 compliance. Using both (inline header + REUSE.toml) follows the pattern used for README.md files.


This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

LiveKit Agents for Node.js — a TypeScript framework for building realtime, multimodal, and voice AI agents that run on servers. This is the Node.js distribution of the [LiveKit Agents framework](https://github.com/livekit/agents) (originally Python).

## Monorepo Structure

- **`agents/`** — Core framework (`@livekit/agents`). Contains agent orchestration, LLM/STT/TTS abstractions, voice pipeline, metrics, IPC/process pooling, and the CLI.
- **`plugins/`** — Provider plugins (`@livekit/agents-plugin-*`). Each implements one or more of: LLM, STT, TTS, VAD, EOU (end-of-utterance), or Avatar.
- **`examples/`** — Example agents (private, not published). Run with `pnpm dlx tsx ./examples/src/<file>.ts dev`.

**Tooling:** pnpm 9.7.0 workspaces, Turborepo for builds, tsup for bundling (CJS + ESM), TypeScript 5.4+, Vitest for tests, Changesets for versioning.

## Common Commands

```bash
pnpm build # Build all packages (turbo)
pnpm clean:build # Clean dist/ dirs then rebuild
pnpm test # Run all tests (vitest)
pnpm test -- agents/src/llm # Run tests by path (Vitest takes path filters as positional arguments)
pnpm test -- --testNamePattern="chat context" # Run tests by name
pnpm test:watch # Watch mode
pnpm lint # ESLint all packages
pnpm lint:fix # ESLint with auto-fix
pnpm format:check # Prettier check
pnpm format:write # Prettier format
pnpm api:check # API Extractor validation
pnpm api:update # Update API declarations
```

### Running an example agent

```bash
pnpm build && pnpm dlx tsx ./examples/src/basic_agent.ts dev --log-level=debug
```

Required env vars: `LIVEKIT_URL`, `LIVEKIT_API_KEY`, `LIVEKIT_API_SECRET`, plus provider keys (e.g. `OPENAI_API_KEY`).

### Debugging individual plugins

Create a test file prefixed with `test_` in `examples/src/`. No `defineAgent` wrapper needed — just import the plugin directly and run:

```bash
pnpm build && pnpm dlx tsx ./examples/src/test_my_plugin.ts
```

## Architecture

Each module under `agents/src/` has its own `CLAUDE.md` with detailed architecture notes. High-level overview:

- **Voice pipeline** (`voice/`): Audio In → VAD → STT → LLM → TTS → Audio Out. `AgentSession` orchestrates, `AgentActivity` manages state machine. `defineAgent({ prewarm, entry })` is the entrypoint pattern.
- **LLM** (`llm/`): `ChatContext` (chronologically ordered), `ChatMessage`, tool calling with Zod schemas, `handoff()` for multi-agent transfers. Provider format adapters for OpenAI and Google.
- **STT** (`stt/`): `SpeechStream` with automatic retry. `StreamAdapter` converts non-streaming STT + VAD to streaming.
- **TTS** (`tts/`): `SynthesizeStream`, `ChunkedStream`. `FallbackAdapter` for multi-provider failover. `StreamAdapter` for non-streaming providers.
- **VAD** (`vad.ts`): Voice Activity Detection interface. Silero plugin is the primary implementation.
- **Inference** (`inference/`): LiveKit Inference Gateway clients. Always use full `provider/model` format (e.g., `'openai/gpt-4o-mini'`).
- **Stream** (`stream/`): Composable Web Streams API primitives (`StreamChannel`, `DeferredStream`, `MultiInputStream`).
- **IPC** (`ipc/`): Process pool for running agents in child processes. Two-way IPC: child sends inference requests back to parent.
- **Worker** (`worker.ts`): Main process connecting to LiveKit server, receives job assignments, spawns agent processes.
- **Plugins** (`plugins/`): Each extends `Plugin` base class. Pattern: `@livekit/agents-plugin-<provider>`. Exports typed implementations (e.g., `openai.LLM`, `deepgram.STT`).

## Code Conventions

- **License header** required on every new file:
```
// SPDX-FileCopyrightText: 2026 LiveKit, Inc.
//
// SPDX-License-Identifier: Apache-2.0
```
- **Prettier**: single quotes, trailing commas, 100 char width, sorted imports.
- **ESLint**: `@typescript-eslint` with strict rules. Prefix unused vars with `_`. Use `type` imports (`consistent-type-imports`).
- **TypeScript**: strict mode, `noUncheckedIndexedAccess`, `verbatimModuleSyntax`, target ES2022, module node16.
- **Time units**: Use milliseconds for all time-based values by default. Only use seconds when the name explicitly ends with `InS`.
- **Changesets**: All packages in `agents/` and `plugins/` release together (fixed versioning). Run `pnpm changeset` to add a changeset before PRing. The examples package is ignored.
- **API Extractor**: Public API surface is tracked. Run `pnpm api:check` after changing exports and `pnpm api:update` to update declarations.

## Testing

- **Framework**: Vitest with 5s default timeout.
- **Pattern**: `*.test.ts` files co-located with source.
- **Snapshots**: Used in LLM chat/tool context tests (`agents/src/llm/__snapshots__/`).
- **Inference LLM tests**: Always use full model names from `agents/src/inference/models.ts` (e.g. `'openai/gpt-4o-mini'`, not `'gpt-4o-mini'`). Initialize logger first: `initializeLogger({ pretty: true })`.

## Porting from Python (`livekit-agents`)

When porting features or fixes from the Python `livekit-agents` repo to this JS/TS repo, follow these rules:

### 1. Python reference comments (`// Ref`)

Every JS change that corresponds to a Python change must carry an inline reference comment directly above the relevant line(s):

```ts
// Ref: python <relative-file-path> - <line-range> lines
```

Examples:

```ts
// Ref: python livekit-agents/livekit/agents/voice/agent_session.py - 362-369 lines
private _aecWarmupRemaining = 0;

// Ref: python livekit-agents/livekit/agents/voice/agent_activity.py - 1236-1240 lines
if (this.agentSession._aecWarmupRemaining > 0) { ... }
```

Use the Python file path relative to the repo root. Include the line range from the Python diff so reviewers can cross-reference directly.

### 2. Time unit unification

Python uses **seconds** (`float`) for all time values. JS/TS uses **milliseconds** (`number`) by default.

When porting a Python time parameter:

- Multiply the Python default by `1000` for the JS default (e.g. `3.0 s` → `3000 ms`)
- Use `setTimeout` / `clearTimeout` directly with the ms value — do **not** multiply by `1000` at call sites
- Name the field in plain form (e.g. `aecWarmupDuration`, `userAwayTimeout`) — the ms convention is implied
- Only use seconds as the unit if the variable name explicitly ends with `InS` (e.g. `delayInS`)

Example mapping:

| Python | JS/TS |
| ------------------------------------------------- | ------------------------------------------ |
| `aec_warmup_duration: float = 3.0` | `aecWarmupDuration: number \| null = 3000` |
| `user_away_timeout: float = 15.0` | `userAwayTimeout: number \| null = 15000` |
| `loop.call_later(self._aec_warmup_remaining, cb)` | `setTimeout(cb, this._aecWarmupRemaining)` |
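
The mapping above can be sketched as follows. This is a minimal illustration, not framework code: `secondsToMs` and `scheduleWarmupEnd` are hypothetical names introduced here, and only the two defaults come from the table.

```ts
// Hypothetical helper for illustration only; not part of @livekit/agents.
// Converts a Python-style default (seconds) to the JS/TS millisecond convention.
function secondsToMs(seconds: number): number {
  return Math.round(seconds * 1000);
}

// Ported defaults are stored in milliseconds, with no unit suffix in the name.
const aecWarmupDuration: number | null = secondsToMs(3.0);
const userAwayTimeout: number | null = secondsToMs(15.0);

// The ms value goes straight into setTimeout; it is not multiplied again here.
function scheduleWarmupEnd(cb: () => void, remainingMs: number): ReturnType<typeof setTimeout> {
  return setTimeout(cb, remainingMs);
}
```

Doing the conversion once, where the default is declared, keeps every call site free of unit arithmetic.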

## CI Requirements

- REUSE/SPDX license compliance
- ESLint passes
- Prettier formatting passes
- Full build succeeds
- Base branch: `main`
21 changes: 21 additions & 0 deletions agents/src/inference/CLAUDE.md
@@ -0,0 +1,21 @@
# CLAUDE.md

🔴 Missing SPDX license header on new file agents/src/inference/CLAUDE.md

New file lacks the SPDX license header required by CONTRIBUTING.md and CLAUDE.md Code Conventions. Not covered by any REUSE.toml annotation either, which will fail the REUSE/SPDX CI check.

Suggested change
# CLAUDE.md
<!--
SPDX-FileCopyrightText: 2026 LiveKit, Inc.
SPDX-License-Identifier: Apache-2.0
-->
# CLAUDE.md


LiveKit Inference Gateway clients for LLM, STT, and TTS. Provides unified interface over LiveKit's cloud inference service.

## Key Classes

- **LLM** — OpenAI-compatible client pointing at LiveKit Inference Gateway. Dynamic JWT token generation for auth. Supports provider format adapters (OpenAI, Google).
- **STT** — WebSocket-based STT client. Streams audio as base64 frames in 50ms chunks. Supports live model/language switching via reconnect events.
- **TTS** — WebSocket-based TTS client (sibling pattern to STT).

## Non-Obvious Patterns

- **Model strings must be `provider/model` format**: e.g., `'openai/gpt-4o-mini'`, `'deepgram/nova-3'`. Never just `'gpt-4o-mini'`.
- **STT language parsing**: Parses `model:language` from the model string (e.g., `'deepgram/nova-3:en'`).
- **STT fallback chains**: If primary model fails, gateway tries fallback models in order.
- **Zod validation**: All gateway protocol messages validated with Zod schemas in `api_protos.ts`.
- **Google thought_signature**: LLMStream preserves `thoughtSignature` across parallel tool calls in a batch, only resets at end of response.
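
As a rough illustration of the `provider/model` and `model:language` conventions above, a parser could look like the sketch below. `parseModelString` and `ParsedModel` are hypothetical names; the actual gateway client may parse and validate differently.

```ts
// Hypothetical parser for illustration; not the real inference client code.
interface ParsedModel {
  provider: string;
  model: string;
  language: string | undefined;
}

function parseModelString(input: string): ParsedModel {
  // Split off an optional ":language" suffix, e.g. 'deepgram/nova-3:en'.
  const colon = input.lastIndexOf(':');
  const modelPart = colon === -1 ? input : input.slice(0, colon);
  const language = colon === -1 ? undefined : input.slice(colon + 1);

  // Require the full 'provider/model' form; a bare model name is rejected.
  const slash = modelPart.indexOf('/');
  if (slash <= 0 || slash === modelPart.length - 1) {
    throw new Error(`expected 'provider/model' format, got '${input}'`);
  }
  return {
    provider: modelPart.slice(0, slash),
    model: modelPart.slice(slash + 1),
    language,
  };
}
```

Rejecting bare names such as `'gpt-4o-mini'` early surfaces the mistake at configuration time rather than as a gateway error.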

## Subdirectory

- `interruption/` — Advanced interrupt detection logic (ML-based adaptive detector).
22 changes: 22 additions & 0 deletions agents/src/ipc/CLAUDE.md
@@ -0,0 +1,22 @@
# CLAUDE.md

🔴 Missing SPDX license header on new file agents/src/ipc/CLAUDE.md

New file lacks the SPDX license header required by CONTRIBUTING.md and CLAUDE.md Code Conventions. Not covered by any REUSE.toml annotation either, which will fail the REUSE/SPDX CI check.

Suggested change
# CLAUDE.md
<!--
SPDX-FileCopyrightText: 2026 LiveKit, Inc.
SPDX-License-Identifier: Apache-2.0
-->
# CLAUDE.md


Inter-process communication for running agents in child Node.js processes.

## Key Classes

- **ProcPool** — Manages warm process pool. Pre-spawns processes and queues them for reuse. Uses `MultiMutex` to control warm process count.
- **SupervisedProc** — Base class for child process lifecycle: health monitoring (ping/pong), memory limits (warns at threshold, kills at limit), graceful shutdown.
- **JobProcExecutor** — Extends SupervisedProc. Forks child process for job execution. Handles inference requests from child by delegating to parent's `InferenceExecutor`.
- **InferenceExecutor** — Interface with single `doInference(method, data)` method. Runs in parent process to share GPU/model resources.

## IPC Protocol

Strongly-typed message union in `message.ts`: `initializeRequest/Response`, `pingRequest/pongResponse`, `startJobRequest`, `shutdownRequest`, `inferenceRequest/Response`, `exiting`, `done`.
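
A discriminated union of this kind might be shaped as in the sketch below. Only the message names come from the list above; the payload fields, the `handleMessage` function, and the synchronous `doInference` callback are assumptions made for illustration.

```ts
// Illustrative subset of the message union. Payload fields are assumptions;
// the real definitions live in message.ts.
type IPCMessage =
  | { case: 'pingRequest'; value: { timestamp: number } }
  | { case: 'pongResponse'; value: { lastTimestamp: number; timestamp: number } }
  | { case: 'inferenceRequest'; value: { requestId: string; method: string; data: unknown } }
  | { case: 'inferenceResponse'; value: { requestId: string; data: unknown } }
  | { case: 'shutdownRequest'; value: { reason: string } };

// Answers the two request types and ignores the rest. Returns the reply, if any.
// doInference is synchronous here only to keep the sketch short.
function handleMessage(
  msg: IPCMessage,
  doInference: (method: string, data: unknown) => unknown,
  now: () => number = Date.now,
): IPCMessage | undefined {
  switch (msg.case) {
    case 'inferenceRequest':
      return {
        case: 'inferenceResponse',
        value: {
          requestId: msg.value.requestId,
          data: doInference(msg.value.method, msg.value.data),
        },
      };
    case 'pingRequest':
      return {
        case: 'pongResponse',
        value: { lastTimestamp: msg.value.timestamp, timestamp: now() },
      };
    case 'pongResponse':
    case 'inferenceResponse':
    case 'shutdownRequest':
      return undefined;
    default: {
      // Compile-time exhaustiveness check: adding a new case breaks this line.
      const unreachable: never = msg;
      return unreachable;
    }
  }
}
```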

## Non-Obvious Patterns

- **Two-way IPC**: Child sends inference requests → parent executes with shared models → parent sends results back. This avoids loading models in every child process.
- **TypeScript child process**: `createProcess()` detects TS files and passes appropriate `execArgv` so the TS loader works in the child.
- **Future-based sync**: `init` and `join` Futures prevent race conditions during process startup and shutdown.
- **Graceful shutdown**: Sends `shutdownRequest`, waits up to `closeTimeout`, then forceful `kill()`.
- **Only `InferenceExecutor` is publicly exported** — the rest is internal.
36 changes: 36 additions & 0 deletions agents/src/llm/CLAUDE.md
@@ -0,0 +1,36 @@
# CLAUDE.md

🔴 Missing SPDX license header on new file agents/src/llm/CLAUDE.md

New file lacks the SPDX license header required by CONTRIBUTING.md and CLAUDE.md Code Conventions. Not covered by any REUSE.toml annotation either, which will fail the REUSE/SPDX CI check.

Suggested change
# CLAUDE.md
<!--
SPDX-FileCopyrightText: 2026 LiveKit, Inc.
SPDX-License-Identifier: Apache-2.0
-->
# CLAUDE.md


LLM integration: chat context management, tool/function calling, provider format adapters, and realtime model abstractions.

## Key Classes

- **ChatContext** — Ordered container of `ChatItem` (ChatMessage | FunctionCall | FunctionCallOutput | AgentHandoffItem). Items sorted by `createdAt` timestamp, enabling out-of-order insertion.
- **ChatMessage** — Single message with polymorphic content (string, ImageContent, AudioContent). Role: 'developer' | 'system' | 'user' | 'assistant'.
- **FunctionCall / FunctionCallOutput** — Tool invocation and result, matched by `callId`. FunctionCall has `groupId` for parallel calls and `thoughtSignature` for Gemini thinking mode.
- **ReadonlyChatContext** — Immutable wrapper that throws on mutation. Used in callbacks.
- **LLM / LLMStream** — Abstract base classes for all LLM plugins. LLMStream handles retry with exponential backoff and metrics (TTFT, token counts).
- **RealtimeModel / RealtimeSession** — Abstractions for streaming/realtime APIs (e.g., OpenAI Realtime).
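
To illustrate how sorting by `createdAt` allows out-of-order insertion, here is a minimal sketch. It assumes a binary search over an already sorted array; `Item` and `insertChronologically` are hypothetical names, and the real `ChatContext` may locate the position differently.

```ts
// Hypothetical item shape for illustration; real ChatItem variants carry more fields.
interface Item {
  id: string;
  createdAt: number;
}

// Finds the first index whose createdAt is greater than the new item's, so items
// with equal timestamps keep their insertion order. Returns the index used.
function insertChronologically<T extends Item>(items: T[], item: T): number {
  let lo = 0;
  let hi = items.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (items[mid]!.createdAt <= item.createdAt) {
      lo = mid + 1;
    } else {
      hi = mid;
    }
  }
  items.splice(lo, 0, item);
  return lo;
}
```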

## Tool System (`tool_context.ts`)

- `tool({ description, parameters, execute })` — Factory function. Parameters accept Zod v3, Zod v4, or raw JSON Schema.
- `handoff({ agent, returns })` — Return from tool to transfer to another agent.
- **Symbol-based type markers**: Tools use private symbols (`TOOL_SYMBOL`, `FUNCTION_TOOL_SYMBOL`, etc.) for runtime discrimination — prevents spoofing.
- **ToolOptions**: Tools receive `{ ctx: RunContext<UserData>, toolCallId, abortSignal }`.

## Provider Format Adapters (`provider_format/`)

Three formats: `'openai'`, `'openai.responses'`, `'google'`.

- **`groupToolCalls()`** — Core algorithm shared by all adapters. Groups assistant messages with their tool calls and outputs by ID/groupId.
- **OpenAI**: Standard chat completions format with `tool_calls` array and `tool` role responses.
- **Google**: Turn-based with parts array. System messages extracted separately. Injects dummy user message (`.`) if last turn isn't user (Gemini requirement). Preserves `thoughtSignature` for thinking-mode models.
- **Image caching**: `ImageContent._cache` stores serialized versions to avoid re-encoding across provider conversions.

## Non-Obvious Patterns

- **Chronological insertion**: `ChatContext` maintains sorted order by `createdAt`. Late-arriving items (e.g., streamed chunks with timestamps) are inserted in correct position.
- **LCS-based diff**: `computeChatCtxDiff()` uses longest common subsequence for minimal create/remove operations — used by `RemoteChatContext` for IPC sync.
- **RemoteChatContext**: Linked-list based context for incremental updates. Insert by previous item ID, convert back via `toChatCtx()`.
- **Zod dual-version**: `zod-utils.ts` auto-detects Zod v3 (`_def.typeName`) vs v4 (`_zod` property) and routes schema conversion accordingly.
- **FallbackAdapter**: Multi-LLM failover with availability tracking, recovery tasks, and `availability_changed` events.
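
The LCS-based diff idea can be sketched on plain ID arrays. This is an illustration under assumptions, not the actual `computeChatCtxDiff()`: `diffIds` and `DiffOps` are hypothetical names, and the `[previousId, id]` pair shape is modeled on the "insert by previous item ID" description above.

```ts
// Hypothetical result shape: IDs to remove, and [previousId | null, id] pairs to create.
interface DiffOps {
  toRemove: string[];
  toCreate: Array<[string | null, string]>;
}

function diffIds(oldIds: string[], newIds: string[]): DiffOps {
  const n = oldIds.length;
  const m = newIds.length;
  // lcs[i][j] = length of the longest common subsequence of oldIds[i..] and newIds[j..].
  const lcs: number[][] = Array.from({ length: n + 1 }, () => new Array<number>(m + 1).fill(0));
  for (let i = n - 1; i >= 0; i--) {
    for (let j = m - 1; j >= 0; j--) {
      lcs[i]![j] =
        oldIds[i] === newIds[j]
          ? lcs[i + 1]![j + 1]! + 1
          : Math.max(lcs[i + 1]![j]!, lcs[i]![j + 1]!);
    }
  }

  const toRemove: string[] = [];
  const toCreate: Array<[string | null, string]> = [];
  let i = 0;
  let j = 0;
  let prev: string | null = null;
  while (i < n || j < m) {
    if (i < n && j < m && oldIds[i] === newIds[j]) {
      // Shared item: keep it and advance both sides.
      prev = newIds[j]!;
      i++;
      j++;
    } else if (j < m && (i === n || lcs[i]![j + 1]! >= lcs[i + 1]![j]!)) {
      // Item exists only in the new list: create it after the previous new item.
      toCreate.push([prev, newIds[j]!]);
      prev = newIds[j]!;
      j++;
    } else {
      // Item exists only in the old list: remove it.
      toRemove.push(oldIds[i]!);
      i++;
    }
  }
  return { toRemove, toCreate };
}
```

Items on the common subsequence are left untouched, so only genuine additions and removals cross the IPC boundary.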
14 changes: 14 additions & 0 deletions agents/src/metrics/CLAUDE.md
@@ -0,0 +1,14 @@
# CLAUDE.md

🔴 Missing SPDX license header on new file agents/src/metrics/CLAUDE.md

New file lacks the SPDX license header required by CONTRIBUTING.md and CLAUDE.md Code Conventions. Not covered by any REUSE.toml annotation either, which will fail the REUSE/SPDX CI check.

Suggested change
# CLAUDE.md
<!--
SPDX-FileCopyrightText: 2026 LiveKit, Inc.
SPDX-License-Identifier: Apache-2.0
-->
# CLAUDE.md


Per-model/provider usage tracking and aggregation for billing and analytics.

## Key Components

- **ModelUsageCollector** — Aggregates metrics by `provider:model` key. Handles both standard LLM metrics and RealtimeModel metrics (with token detail breakdowns: text, image, audio, cached).
- **Usage types**: `LLMModelUsage`, `TTSModelUsage`, `STTModelUsage`, `InterruptionModelUsage` — each with provider-specific fields.
- **`filterZeroValues()`** — Strips zero-valued fields from usage objects for clean JSON output.
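
The two ideas above, keying by `provider:model` and stripping zero-valued fields, can be sketched as follows. `LLMUsageSketch` and `UsageAggregator` are hypothetical, and this `filterZeroValues` is a simplified stand-in that only handles top-level numeric zeros; the real one may behave differently.

```ts
// Hypothetical usage shape for illustration; the real usage types carry more fields.
interface LLMUsageSketch {
  provider: string;
  model: string;
  inputTokens: number;
  outputTokens: number;
}

// Aggregates usage per `provider:model` key.
class UsageAggregator {
  private readonly byKey = new Map<string, LLMUsageSketch>();

  add(usage: LLMUsageSketch): void {
    const key = `${usage.provider}:${usage.model}`;
    const current = this.byKey.get(key);
    if (current) {
      current.inputTokens += usage.inputTokens;
      current.outputTokens += usage.outputTokens;
    } else {
      this.byKey.set(key, { ...usage });
    }
  }

  get(provider: string, model: string): LLMUsageSketch | undefined {
    return this.byKey.get(`${provider}:${model}`);
  }
}

// Simplified stand-in: drops top-level fields whose value is exactly 0.
function filterZeroValues<T extends object>(usage: T): Partial<T> {
  const out: Partial<T> = {};
  for (const key of Object.keys(usage) as Array<keyof T>) {
    const value: unknown = usage[key];
    if (value !== 0) out[key] = usage[key];
  }
  return out;
}
```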

## Non-Obvious Patterns

- **Session duration tracking**: Some models (xAI) bill by session duration rather than tokens — tracked in `sessionDurationMs`.
- **UsageCollector is deprecated** — use `ModelUsageCollector` instead.
17 changes: 17 additions & 0 deletions agents/src/stream/CLAUDE.md
@@ -0,0 +1,17 @@
# CLAUDE.md

🔴 Missing SPDX license header on new file agents/src/stream/CLAUDE.md

New file lacks the SPDX license header required by CONTRIBUTING.md and CLAUDE.md Code Conventions. Not covered by any REUSE.toml annotation either, which will fail the REUSE/SPDX CI check.

Suggested change
# CLAUDE.md
<!--
SPDX-FileCopyrightText: 2026 LiveKit, Inc.
SPDX-License-Identifier: Apache-2.0
-->
# CLAUDE.md


Low-level async stream composition primitives built on the Web Streams API (`ReadableStream`, `WritableStream`, `TransformStream`).

## Key Classes

- **StreamChannel<T, E>** — Bidirectional stream: write to it, read from it. `addStreamInput()` launches async reader loops to pipe external streams in.
- **DeferredReadableStream<T>** — Readable stream where the actual source is set later via `setSource()`. Supports detach/reattach.
- **MultiInputStream<T>** — Fan-in multiplexer: N dynamic inputs → 1 output. Inputs can be added/removed at runtime. Output stays open after all inputs end (waits for new inputs).
- **IdentityTransform<T>** — Pass-through `TransformStream` with `highWaterMark` set to `MAX_SAFE_INTEGER` to prevent backpressure.
- **mergeReadableStreams()** — Functional merge of N streams (adapted from Deno). If one errors, merged output closes.

## Non-Obvious Patterns

- **IdentityTransform high water mark**: Intentionally disables backpressure on both sides. This follows the Python agents `channel.py` pattern — needed for concurrent sources.
- **Reader lock cleanup**: TypeErrors from releasing already-released locks are caught and ignored throughout. This is intentional.
- **MultiInputStream resilience**: Errors in one input don't kill the output stream. Failed inputs are removed silently.
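
The pass-through transform with backpressure disabled can be approximated with the standard Web Streams constructor. This is a sketch assuming a runtime that exposes `TransformStream` globally (Node.js 18+); the class name `IdentityTransformSketch` is used to make clear it is not the repo's own class.

```ts
// Sketch only: a pass-through TransformStream whose writable and readable sides
// both use a very large highWaterMark, so writers effectively never see backpressure.
class IdentityTransformSketch<T> extends TransformStream<T, T> {
  constructor() {
    super(
      {
        // Forward every chunk unchanged.
        transform(chunk, controller) {
          controller.enqueue(chunk);
        },
      },
      { highWaterMark: Number.MAX_SAFE_INTEGER }, // writable-side strategy
      { highWaterMark: Number.MAX_SAFE_INTEGER }, // readable-side strategy
    );
  }
}
```

With both strategies set this high, a writer's `desiredSize` stays positive for any realistic queue length, which is what lets several concurrent sources write without awaiting each other. The trade-off is unbounded buffering if the reader stalls.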
24 changes: 24 additions & 0 deletions agents/src/stt/CLAUDE.md
@@ -0,0 +1,24 @@
# CLAUDE.md

🔴 Missing SPDX license header on new file agents/src/stt/CLAUDE.md

New file lacks the SPDX license header required by CONTRIBUTING.md and CLAUDE.md Code Conventions. Not covered by any REUSE.toml annotation either, which will fail the REUSE/SPDX CI check.

Suggested change
# CLAUDE.md
<!--
SPDX-FileCopyrightText: 2026 LiveKit, Inc.
SPDX-License-Identifier: Apache-2.0
-->
# CLAUDE.md


Speech-to-text abstractions with streaming, VAD-based adapters, and automatic retry.

## Key Classes

- **STT** — Abstract base. Subclasses implement `_recognize()` (one-shot) and `stream()` (streaming). Emits `metrics_collected` and `error` events.
- **SpeechStream** — Async iterable consuming audio frames via `pushFrame()`, yielding `SpeechEvent` objects. Handles audio resampling internally if sample rates don't match.
- **StreamAdapter** — Wraps a non-streaming STT + VAD to create a streaming interface. Buffers audio during speech, calls `recognize()` on end-of-speech.

## Architecture

```
pushFrame() → AudioResampler (if needed) → AsyncIterableQueue → run() (provider impl) → output queue → consumer
```

## Non-Obvious Patterns

- **Multi-queue architecture**: An input queue, an intermediate queue (for metrics monitoring), and an output queue run concurrently.
- **FLUSH_SENTINEL**: Private static symbol signals flush operations internally without creating actual events.
- **startSoon() in constructor**: Defers `mainTask()` until after constructor completes to avoid accessing uninitialized fields.
- **Resampler created on-demand**: Only instantiated when first frame with different sample rate arrives.
- **Retry with exponential backoff**: `mainTask()` retries on `APIError`/`APIConnectionError`; other errors are immediately fatal.
- **startTimeOffset**: Can offset transcription timestamps for stream resumption.
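
The retry rule above (retry on API errors, fail fast on anything else) can be sketched as below. The error classes are local stand-ins that only share names with the framework's, and the backoff schedule (`baseMs * 2^attempt`, capped at `maxMs`) and `maxRetries` default are assumptions; the source does not state the actual intervals.

```ts
// Local stand-ins for illustration; the framework defines its own error classes.
class APIError extends Error {}
class APIConnectionError extends Error {}

// Only API-level failures are retried; everything else is treated as fatal.
function isRetryable(err: unknown): boolean {
  return err instanceof APIError || err instanceof APIConnectionError;
}

// Assumed schedule: baseMs * 2^attempt, capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 100, maxMs = 5000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

async function withRetry<T>(fn: () => Promise<T>, maxRetries = 3): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      // Rethrow immediately for non-retryable errors or when retries are exhausted.
      if (!isRetryable(err) || attempt >= maxRetries) throw err;
      await new Promise((resolve) => setTimeout(resolve, backoffDelayMs(attempt)));
    }
  }
}
```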
16 changes: 16 additions & 0 deletions agents/src/telemetry/CLAUDE.md
@@ -0,0 +1,16 @@
# CLAUDE.md

🔴 Missing SPDX license header on new file agents/src/telemetry/CLAUDE.md

New file lacks the SPDX license header required by CONTRIBUTING.md and CLAUDE.md Code Conventions. Not covered by any REUSE.toml annotation either, which will fail the REUSE/SPDX CI check.

Suggested change
# CLAUDE.md
<!--
SPDX-FileCopyrightText: 2026 LiveKit, Inc.
SPDX-License-Identifier: Apache-2.0
-->
# CLAUDE.md


OpenTelemetry integration for distributed tracing, logging, and session report uploads.

## Key Components

- **DynamicTracer** — Runtime-switchable tracer provider wrapper. Global instance exported as `tracer`.
- **setupCloudTracer()** — Complete cloud observability setup: OTLP exporter, metadata span processor, Pino cloud log exporter. Uses JWT for auth.
- **uploadSessionReport()** — Uploads chat history (JSON), metrics (protobuf header), and audio (OGG) to LiveKit Cloud via multipart FormData.
- **MetadataLogProcessor / ExtraDetailsProcessor** — Inject room_id, job_id, logger names into all log records.

## Non-Obvious Patterns

- **Monotonic timestamp ordering**: Session report adds 1μs offsets to colliding timestamps to ensure correct dashboard display ordering.
- **Dynamic tracer provider**: Can change tracer provider mid-session (used when cloud connection establishes after startup).
- **Metadata injection**: All spans automatically tagged with room_id and job_id via `MetadataSpanProcessor`.
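
The monotonic timestamp adjustment can be illustrated with a small sketch. It assumes timestamps are integer microseconds already in non-decreasing order; `makeStrictlyIncreasing` is a hypothetical name, not the function used by the session report code.

```ts
// Bumps colliding timestamps by 1 microsecond each so the sequence becomes
// strictly increasing. Input is assumed sorted in non-decreasing order.
function makeStrictlyIncreasing(timestampsUs: number[]): number[] {
  const out: number[] = [];
  let last = Number.NEGATIVE_INFINITY;
  for (const ts of timestampsUs) {
    // A timestamp at or before the previous output is moved 1 microsecond past it.
    const next = ts <= last ? last + 1 : ts;
    out.push(next);
    last = next;
  }
  return out;
}
```

Microsecond epoch values are on the order of 10^15, which still fits below `Number.MAX_SAFE_INTEGER`, so plain numbers are sufficient for this sketch.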
15 changes: 15 additions & 0 deletions agents/src/tokenize/CLAUDE.md
@@ -0,0 +1,15 @@
# CLAUDE.md

🔴 Missing SPDX license header on new file agents/src/tokenize/CLAUDE.md

New file lacks the SPDX license header required by CONTRIBUTING.md and CLAUDE.md Code Conventions. Not covered by any REUSE.toml annotation either, which will fail the REUSE/SPDX CI check.

Suggested change
# CLAUDE.md
<!--
SPDX-FileCopyrightText: 2026 LiveKit, Inc.
SPDX-License-Identifier: Apache-2.0
-->
# CLAUDE.md


Streaming text tokenization for real-time TTS. Incrementally splits text into sentences or words with configurable buffering.

## Key Classes

- **SentenceTokenizer / WordTokenizer** — Abstract bases with `tokenize()` (batch) and `stream()` (streaming) methods.
- **BufferedTokenStream** — Core streaming implementation. Buffers input until `minContextLength`, then tokenizes and holds output until `minTokenLength` before emitting. Each `flush()` generates a new `segmentId`.
- **Basic implementations** (`basic/`) — Default English tokenizers using rule-based sentence/word splitting. Includes hyphenation support.

## Non-Obvious Patterns

- **Designed for TTS pipeline**: Text arrives incrementally from LLM streaming. Tokenizer buffers enough context for accurate sentence boundaries before emitting.
- **Tuple tokens**: Some tokenizers return `[text, startPos, endPos]` tuples for position tracking, not just strings.
- **Segment tracking**: `flush()` creates new segment IDs, allowing consumers to distinguish continuous speech from intentional breaks.
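
The buffering behavior described above can be sketched roughly as follows. This is a simplified, synchronous illustration: `BufferedTokenStreamSketch`, `splitSentences`, and the `segment-N` ID format are hypothetical, and keeping the last token in the input buffer until more text arrives is an assumption about how incomplete sentences are handled.

```ts
type TokenizeFn = (text: string) => string[];

interface TokenData {
  token: string;
  segmentId: string;
}

// Naive sentence splitter used only for this sketch.
const splitSentences: TokenizeFn = (text) =>
  (text.match(/[^.!?]+[.!?]*/g) ?? []).map((s) => s.trim()).filter((s) => s.length > 0);

class BufferedTokenStreamSketch {
  private readonly tokenizeFn: TokenizeFn;
  private readonly minContextLength: number;
  private readonly minTokenLength: number;
  private inBuf = '';
  private outBuf = '';
  private segmentIndex = 0;
  readonly emitted: TokenData[] = [];

  constructor(tokenizeFn: TokenizeFn, minContextLength: number, minTokenLength: number) {
    this.tokenizeFn = tokenizeFn;
    this.minContextLength = minContextLength;
    this.minTokenLength = minTokenLength;
  }

  pushText(text: string): void {
    this.inBuf += text;
    // Wait until there is enough context to find sentence boundaries reliably.
    if (this.inBuf.length < this.minContextLength) return;
    const tokens = this.tokenizeFn(this.inBuf);
    // The last token may still be growing, so it stays in the input buffer.
    for (const tok of tokens.slice(0, -1)) {
      this.hold(tok);
      const end = this.inBuf.indexOf(tok) + tok.length;
      this.inBuf = this.inBuf.slice(end).trimStart();
    }
  }

  flush(): void {
    // Emit whatever remains, then start a new segment.
    for (const tok of this.tokenizeFn(this.inBuf)) this.hold(tok);
    if (this.outBuf) this.emit();
    this.inBuf = '';
    this.segmentIndex += 1;
  }

  // Accumulates tokens until the held output reaches minTokenLength.
  private hold(tok: string): void {
    this.outBuf = this.outBuf ? `${this.outBuf} ${tok}` : tok;
    if (this.outBuf.length >= this.minTokenLength) this.emit();
  }

  private emit(): void {
    this.emitted.push({ token: this.outBuf, segmentId: `segment-${this.segmentIndex}` });
    this.outBuf = '';
  }
}
```

Short sentences such as "Hi." are merged with the following one until `minTokenLength` is reached, which avoids sending very small fragments to TTS.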