Add claude md for 1.1.x #1131
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,136 @@ | ||
| # CLAUDE.md | ||
|
|
||
| This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository. | ||
|
|
||
| ## Project Overview | ||
|
|
||
| LiveKit Agents for Node.js — a TypeScript framework for building realtime, multimodal, and voice AI agents that run on servers. This is the Node.js distribution of the [LiveKit Agents framework](https://github.com/livekit/agents) (originally Python). | ||
|
|
||
| ## Monorepo Structure | ||
|
|
||
| - **`agents/`** — Core framework (`@livekit/agents`). Contains agent orchestration, LLM/STT/TTS abstractions, voice pipeline, metrics, IPC/process pooling, and the CLI. | ||
| - **`plugins/`** — Provider plugins (`@livekit/agents-plugin-*`). Each implements one or more of: LLM, STT, TTS, VAD, EOU (end-of-utterance), or Avatar. | ||
| - **`examples/`** — Example agents (private, not published). Run with `pnpm dlx tsx ./examples/src/<file>.ts dev`. | ||
|
|
||
| **Tooling:** pnpm 9.7.0 workspaces, Turborepo for builds, tsup for bundling (CJS + ESM), TypeScript 5.4+, Vitest for tests, Changesets for versioning. | ||
|
|
||
| ## Common Commands | ||
|
|
||
| ```bash | ||
| pnpm build # Build all packages (turbo) | ||
| pnpm clean:build # Clean dist/ dirs then rebuild | ||
| pnpm test # Run all tests (vitest) | ||
| pnpm test -- agents/src/llm # Run tests by path | ||
| pnpm test -- --testNamePattern="chat context" # Run tests by name | ||
| pnpm test:watch # Watch mode | ||
| pnpm lint # ESLint all packages | ||
| pnpm lint:fix # ESLint with auto-fix | ||
| pnpm format:check # Prettier check | ||
| pnpm format:write # Prettier format | ||
| pnpm api:check # API Extractor validation | ||
| pnpm api:update # Update API declarations | ||
| ``` | ||
|
|
||
| ### Running an example agent | ||
|
|
||
| ```bash | ||
| pnpm build && pnpm dlx tsx ./examples/src/basic_agent.ts dev --log-level=debug | ||
| ``` | ||
|
|
||
| Required env vars: `LIVEKIT_URL`, `LIVEKIT_API_KEY`, `LIVEKIT_API_SECRET`, plus provider keys (e.g. `OPENAI_API_KEY`). | ||
|
|
||
| ### Debugging individual plugins | ||
|
|
||
| Create a test file prefixed with `test_` in `examples/src/`. No `defineAgent` wrapper needed — just import the plugin directly and run: | ||
|
|
||
| ```bash | ||
| pnpm build && node ./examples/src/test_my_plugin.ts | ||
| ``` | ||
|
|
||
| ## Architecture | ||
|
|
||
| Each module under `agents/src/` has its own `CLAUDE.md` with detailed architecture notes. High-level overview: | ||
|
|
||
| - **Voice pipeline** (`voice/`): Audio In → VAD → STT → LLM → TTS → Audio Out. `AgentSession` orchestrates, `AgentActivity` manages state machine. `defineAgent({ prewarm, entry })` is the entrypoint pattern. | ||
| - **LLM** (`llm/`): `ChatContext` (chronologically ordered), `ChatMessage`, tool calling with Zod schemas, `handoff()` for multi-agent transfers. Provider format adapters for OpenAI and Google. | ||
| - **STT** (`stt/`): `SpeechStream` with automatic retry. `StreamAdapter` converts non-streaming STT + VAD to streaming. | ||
| - **TTS** (`tts/`): `SynthesizeStream`, `ChunkedStream`. `FallbackAdapter` for multi-provider failover. `StreamAdapter` for non-streaming providers. | ||
| - **VAD** (`vad.ts`): Voice Activity Detection interface. Silero plugin is the primary implementation. | ||
| - **Inference** (`inference/`): LiveKit Inference Gateway clients. Always use full `provider/model` format (e.g., `'openai/gpt-4o-mini'`). | ||
| - **Stream** (`stream/`): Composable Web Streams API primitives (`StreamChannel`, `DeferredStream`, `MultiInputStream`). | ||
| - **IPC** (`ipc/`): Process pool for running agents in child processes. Two-way IPC: child sends inference requests back to parent. | ||
| - **Worker** (`worker.ts`): Main process connecting to LiveKit server, receives job assignments, spawns agent processes. | ||
| - **Plugins** (`plugins/`): Each extends `Plugin` base class. Pattern: `@livekit/agents-plugin-<provider>`. Exports typed implementations (e.g., `openai.LLM`, `deepgram.STT`). | ||
|
|
||
| ## Code Conventions | ||
|
|
||
| - **License header** required on every new file: | ||
| ``` | ||
| // SPDX-FileCopyrightText: 2026 LiveKit, Inc. | ||
| // | ||
| // SPDX-License-Identifier: Apache-2.0 | ||
| ``` | ||
| - **Prettier**: single quotes, trailing commas, 100 char width, sorted imports. | ||
| - **ESLint**: `@typescript-eslint` with strict rules. Prefix unused vars with `_`. Use `type` imports (`consistent-type-imports`). | ||
| - **TypeScript**: strict mode, `noUncheckedIndexedAccess`, `verbatimModuleSyntax`, target ES2022, module node16. | ||
| - **Time units**: Use milliseconds for all time-based values by default. Only use seconds when the name explicitly ends with `InS`. | ||
| - **Changesets**: All packages in `agents/` and `plugins/` release together (fixed versioning). Run `pnpm changeset` to add a changeset before PRing. The examples package is ignored. | ||
| - **API Extractor**: Public API surface is tracked. Run `pnpm api:check` after changing exports and `pnpm api:update` to update declarations. | ||
|
|
||
| ## Testing | ||
|
|
||
| - **Framework**: Vitest with 5s default timeout. | ||
| - **Pattern**: `*.test.ts` files co-located with source. | ||
| - **Snapshots**: Used in LLM chat/tool context tests (`agents/src/llm/__snapshots__/`). | ||
| - **Inference LLM tests**: Always use full model names from `agents/src/inference/models.ts` (e.g. `'openai/gpt-4o-mini'`, not `'gpt-4o-mini'`). Initialize logger first: `initializeLogger({ pretty: true })`. | ||
|
|
||
| ## Porting from Python (`livekit-agents`) | ||
|
|
||
| When porting features or fixes from the Python `livekit-agents` repo to this JS/TS repo, follow these rules: | ||
|
|
||
| ### 1. Python reference comments (`// Ref`) | ||
|
|
||
| Every JS change that corresponds to a Python change must carry an inline reference comment directly above the relevant line(s): | ||
|
|
||
| ```ts | ||
| // Ref: python <relative-file-path> - <line-range> lines | ||
| ``` | ||
|
|
||
| Examples: | ||
|
|
||
| ```ts | ||
| // Ref: python livekit-agents/livekit/agents/voice/agent_session.py - 362-369 lines | ||
| private _aecWarmupRemaining = 0; | ||
|
|
||
| // Ref: python livekit-agents/livekit/agents/voice/agent_activity.py - 1236-1240 lines | ||
| if (this.agentSession._aecWarmupRemaining > 0) { ... } | ||
| ``` | ||
|
|
||
| Use the Python file path relative to the repo root. Include the line range from the Python diff so reviewers can cross-reference directly. | ||
|
|
||
| ### 2. Time unit unification | ||
|
|
||
| Python uses **seconds** (`float`) for all time values. JS/TS uses **milliseconds** (`number`) by default. | ||
|
|
||
| When porting a Python time parameter: | ||
|
|
||
| - Multiply the Python default by `1000` for the JS default (e.g. `3.0 s` → `3000 ms`) | ||
| - Use `setTimeout` / `clearTimeout` directly with the ms value — do **not** multiply by `1000` at call sites | ||
| - Name the field in plain form (e.g. `aecWarmupDuration`, `userAwayTimeout`) — the ms convention is implied | ||
| - Only use seconds as the unit if the variable name explicitly ends with `InS` (e.g. `delayInS`) | ||
|
|
||
| Example mapping: | ||
|
|
||
| | Python | JS/TS | | ||
| | ------------------------------------------------- | ------------------------------------------ | | ||
| | `aec_warmup_duration: float = 3.0` | `aecWarmupDuration: number \| null = 3000` | | ||
| | `user_away_timeout: float = 15.0` | `userAwayTimeout: number \| null = 15000` | | ||
| | `loop.call_later(self._aec_warmup_remaining, cb)` | `setTimeout(cb, this._aecWarmupRemaining)` | | ||
|
|
||
| ## CI Requirements | ||
|
|
||
| - REUSE/SPDX license compliance | ||
| - ESLint passes | ||
| - Prettier formatting passes | ||
| - Full build succeeds | ||
| - Base branch: `main` | ||
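
The time-unit rules in the porting section above can be sketched as follows: a Python default in seconds is scaled once, where the default is declared, and the millisecond value is then passed to `setTimeout` unchanged. `secondsToMs`, `SessionTimingOptions`, and `scheduleWarmupEnd` are hypothetical names used only for illustration; they are not part of `@livekit/agents`.

```ts
// Hypothetical helper: convert a Python default (seconds) to the JS default (milliseconds).
function secondsToMs(seconds: number): number {
  return Math.round(seconds * 1000);
}

// Hypothetical options object. Plain names imply milliseconds;
// only a name ending in `InS` would carry seconds.
interface SessionTimingOptions {
  aecWarmupDuration: number | null; // ms
  userAwayTimeout: number | null; // ms
}

const defaults: SessionTimingOptions = {
  aecWarmupDuration: secondsToMs(3.0), // Python: aec_warmup_duration: float = 3.0
  userAwayTimeout: secondsToMs(15.0), // Python: user_away_timeout: float = 15.0
};

// At the call site the ms value goes straight into setTimeout, with no further scaling.
function scheduleWarmupEnd(opts: SessionTimingOptions, onDone: () => void) {
  if (opts.aecWarmupDuration === null) return undefined;
  return setTimeout(onDone, opts.aecWarmupDuration);
}
```

Keeping the conversion at the default, rather than at each call site, is what lets reviewers compare the Python and JS values directly.
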
| Original file line number | Diff line number | Diff line change | ||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| @@ -0,0 +1,21 @@ | ||||||||||||||||||
| # CLAUDE.md | ||||||||||||||||||
|
Contributor
🔴 Missing SPDX license header on new file agents/src/inference/CLAUDE.md
New file lacks the SPDX license header required by CONTRIBUTING.md and CLAUDE.md Code Conventions. Not covered by any |
||||||||||||||||||
|
|
||||||||||||||||||
| LiveKit Inference Gateway clients for LLM, STT, and TTS. Provides unified interface over LiveKit's cloud inference service. | ||||||||||||||||||
|
|
||||||||||||||||||
| ## Key Classes | ||||||||||||||||||
|
|
||||||||||||||||||
| - **LLM** — OpenAI-compatible client pointing at LiveKit Inference Gateway. Dynamic JWT token generation for auth. Supports provider format adapters (OpenAI, Google). | ||||||||||||||||||
| - **STT** — WebSocket-based STT client. Streams audio as base64 frames in 50ms chunks. Supports live model/language switching via reconnect events. | ||||||||||||||||||
| - **TTS** — WebSocket-based TTS client (sibling pattern to STT). | ||||||||||||||||||
|
|
||||||||||||||||||
| ## Non-Obvious Patterns | ||||||||||||||||||
|
|
||||||||||||||||||
| - **Model strings must be `provider/model` format**: e.g., `'openai/gpt-4o-mini'`, `'deepgram/nova-3'`. Never just `'gpt-4o-mini'`. | ||||||||||||||||||
| - **STT language parsing**: Parses `model:language` from the model string (e.g., `'deepgram/nova-3:en'`). | ||||||||||||||||||
| - **STT fallback chains**: If primary model fails, gateway tries fallback models in order. | ||||||||||||||||||
| - **Zod validation**: All gateway protocol messages validated with Zod schemas in `api_protos.ts`. | ||||||||||||||||||
| - **Google thought_signature**: LLMStream preserves `thoughtSignature` across parallel tool calls in a batch, only resets at end of response. | ||||||||||||||||||
|
|
||||||||||||||||||
| ## Subdirectory | ||||||||||||||||||
|
|
||||||||||||||||||
| - `interruption/` — Advanced interrupt detection logic (ML-based adaptive detector). | ||||||||||||||||||
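
To make the `provider/model` and `model:language` conventions above concrete, here is a minimal parsing sketch. `parseModelString` and `ParsedModel` are hypothetical names; the gateway clients may parse these strings differently, so treat this as an illustration of the format rather than the library's code.

```ts
// Hypothetical result shape for a `provider/model[:language]` string.
interface ParsedModel {
  provider: string;
  model: string;
  language?: string;
}

// Hypothetical parser: rejects bare model names such as 'gpt-4o-mini'.
function parseModelString(input: string): ParsedModel {
  const slash = input.indexOf('/');
  if (slash <= 0 || slash === input.length - 1) {
    throw new Error(`Expected 'provider/model' format, got '${input}'`);
  }
  const provider = input.slice(0, slash);
  const rest = input.slice(slash + 1);
  // An optional `:language` suffix follows the model name (STT only).
  const colon = rest.lastIndexOf(':');
  if (colon === -1) return { provider, model: rest };
  return { provider, model: rest.slice(0, colon), language: rest.slice(colon + 1) };
}
```
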
| Original file line number | Diff line number | Diff line change | ||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| @@ -0,0 +1,22 @@ | ||||||||||||||||||
| # CLAUDE.md | ||||||||||||||||||
|
Contributor
🔴 Missing SPDX license header on new file agents/src/ipc/CLAUDE.md
New file lacks the SPDX license header required by CONTRIBUTING.md and CLAUDE.md Code Conventions. Not covered by any |
||||||||||||||||||
|
|
||||||||||||||||||
| Inter-process communication for running agents in child Node.js processes. | ||||||||||||||||||
|
|
||||||||||||||||||
| ## Key Classes | ||||||||||||||||||
|
|
||||||||||||||||||
| - **ProcPool** — Manages warm process pool. Pre-spawns processes and queues them for reuse. Uses `MultiMutex` to control warm process count. | ||||||||||||||||||
| - **SupervisedProc** — Base class for child process lifecycle: health monitoring (ping/pong), memory limits (warns at threshold, kills at limit), graceful shutdown. | ||||||||||||||||||
| - **JobProcExecutor** — Extends SupervisedProc. Forks child process for job execution. Handles inference requests from child by delegating to parent's `InferenceExecutor`. | ||||||||||||||||||
| - **InferenceExecutor** — Interface with single `doInference(method, data)` method. Runs in parent process to share GPU/model resources. | ||||||||||||||||||
|
|
||||||||||||||||||
| ## IPC Protocol | ||||||||||||||||||
|
|
||||||||||||||||||
| Strongly-typed message union in `message.ts`: `initializeRequest/Response`, `pingRequest/pongResponse`, `startJobRequest`, `shutdownRequest`, `inferenceRequest/Response`, `exiting`, `done`. | ||||||||||||||||||
|
|
||||||||||||||||||
| ## Non-Obvious Patterns | ||||||||||||||||||
|
|
||||||||||||||||||
| - **Two-way IPC**: Child sends inference requests → parent executes with shared models → parent sends results back. This avoids loading models in every child process. | ||||||||||||||||||
| - **TypeScript child process**: `createProcess()` detects TS files and passes appropriate `execArgv` so the TS loader works in the child. | ||||||||||||||||||
| - **Future-based sync**: `init` and `join` Futures prevent race conditions during process startup and shutdown. | ||||||||||||||||||
| - **Graceful shutdown**: Sends `shutdownRequest`, waits up to `closeTimeout`, then forceful `kill()`. | ||||||||||||||||||
| - **Only `InferenceExecutor` is publicly exported** — the rest is internal. | ||||||||||||||||||
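
The "strongly-typed message union" idea can be illustrated with a discriminated union and an exhaustive `switch`. The message names come from the list above, but the field names (`type`, `timestamp`, `lastTimestamp`, `requestId`, `method`, `data`) and the `describeMessage` helper are assumptions made for this sketch; the real shapes live in `message.ts` and may differ.

```ts
// Hypothetical message shapes covering a subset of the protocol.
type IPCMessageSketch =
  | { type: 'pingRequest'; timestamp: number }
  | { type: 'pongResponse'; lastTimestamp: number; timestamp: number }
  | { type: 'inferenceRequest'; requestId: string; method: string; data: unknown }
  | { type: 'shutdownRequest' };

// The `never` assignment in the default branch makes the compiler flag
// any message type that is added to the union but not handled here.
function describeMessage(msg: IPCMessageSketch): string {
  switch (msg.type) {
    case 'pingRequest':
      return `ping sent at ${msg.timestamp}`;
    case 'pongResponse':
      return `pong after ${msg.timestamp - msg.lastTimestamp} ms`;
    case 'inferenceRequest':
      return `inference ${msg.method} (${msg.requestId})`;
    case 'shutdownRequest':
      return 'shutdown requested';
    default: {
      const unreachable: never = msg;
      return unreachable;
    }
  }
}
```
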
| Original file line number | Diff line number | Diff line change | ||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| @@ -0,0 +1,36 @@ | ||||||||||||||||||
| # CLAUDE.md | ||||||||||||||||||
|
Contributor
🔴 Missing SPDX license header on new file agents/src/llm/CLAUDE.md
New file lacks the SPDX license header required by CONTRIBUTING.md and CLAUDE.md Code Conventions. Not covered by any |
||||||||||||||||||
|
|
||||||||||||||||||
| LLM integration: chat context management, tool/function calling, provider format adapters, and realtime model abstractions. | ||||||||||||||||||
|
|
||||||||||||||||||
| ## Key Classes | ||||||||||||||||||
|
|
||||||||||||||||||
| - **ChatContext** — Ordered container of `ChatItem` (ChatMessage | FunctionCall | FunctionCallOutput | AgentHandoffItem). Items sorted by `createdAt` timestamp, enabling out-of-order insertion. | ||||||||||||||||||
| - **ChatMessage** — Single message with polymorphic content (string, ImageContent, AudioContent). Role: 'developer' | 'system' | 'user' | 'assistant'. | ||||||||||||||||||
| - **FunctionCall / FunctionCallOutput** — Tool invocation and result, matched by `callId`. FunctionCall has `groupId` for parallel calls and `thoughtSignature` for Gemini thinking mode. | ||||||||||||||||||
| - **ReadonlyChatContext** — Immutable wrapper that throws on mutation. Used in callbacks. | ||||||||||||||||||
| - **LLM / LLMStream** — Abstract base classes for all LLM plugins. LLMStream handles retry with exponential backoff and metrics (TTFT, token counts). | ||||||||||||||||||
| - **RealtimeModel / RealtimeSession** — Abstractions for streaming/realtime APIs (e.g., OpenAI Realtime). | ||||||||||||||||||
|
|
||||||||||||||||||
| ## Tool System (`tool_context.ts`) | ||||||||||||||||||
|
|
||||||||||||||||||
| - `tool({ description, parameters, execute })` — Factory function. Parameters accept Zod v3, Zod v4, or raw JSON Schema. | ||||||||||||||||||
| - `handoff({ agent, returns })` — Return from tool to transfer to another agent. | ||||||||||||||||||
| - **Symbol-based type markers**: Tools use private symbols (`TOOL_SYMBOL`, `FUNCTION_TOOL_SYMBOL`, etc.) for runtime discrimination — prevents spoofing. | ||||||||||||||||||
| - **ToolOptions**: Tools receive `{ ctx: RunContext<UserData>, toolCallId, abortSignal }`. | ||||||||||||||||||
|
|
||||||||||||||||||
| ## Provider Format Adapters (`provider_format/`) | ||||||||||||||||||
|
|
||||||||||||||||||
| Three formats: `'openai'`, `'openai.responses'`, `'google'`. | ||||||||||||||||||
|
|
||||||||||||||||||
| - **`groupToolCalls()`** — Core algorithm shared by all adapters. Groups assistant messages with their tool calls and outputs by ID/groupId. | ||||||||||||||||||
| - **OpenAI**: Standard chat completions format with `tool_calls` array and `tool` role responses. | ||||||||||||||||||
| - **Google**: Turn-based with parts array. System messages extracted separately. Injects dummy user message (`.`) if last turn isn't user (Gemini requirement). Preserves `thoughtSignature` for thinking-mode models. | ||||||||||||||||||
| - **Image caching**: `ImageContent._cache` stores serialized versions to avoid re-encoding across provider conversions. | ||||||||||||||||||
|
|
||||||||||||||||||
| ## Non-Obvious Patterns | ||||||||||||||||||
|
|
||||||||||||||||||
| - **Chronological insertion**: `ChatContext` maintains sorted order by `createdAt`. Late-arriving items (e.g., streamed chunks with timestamps) are inserted in correct position. | ||||||||||||||||||
| - **LCS-based diff**: `computeChatCtxDiff()` uses longest common subsequence for minimal create/remove operations — used by `RemoteChatContext` for IPC sync. | ||||||||||||||||||
| - **RemoteChatContext**: Linked-list based context for incremental updates. Insert by previous item ID, convert back via `toChatCtx()`. | ||||||||||||||||||
| - **Zod dual-version**: `zod-utils.ts` auto-detects Zod v3 (`_def.typeName`) vs v4 (`_zod` property) and routes schema conversion accordingly. | ||||||||||||||||||
| - **FallbackAdapter**: Multi-LLM failover with availability tracking, recovery tasks, and `availability_changed` events. | ||||||||||||||||||
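
The chronological-insertion behavior described above can be sketched with a small helper that keeps an array sorted by `createdAt`. `TimedItem` and `insertChronologically` are hypothetical names, and the binary search is a choice made for this sketch; `ChatContext` itself may locate the insert position differently.

```ts
// Hypothetical minimal item shape: only the fields the ordering needs.
interface TimedItem {
  id: string;
  createdAt: number;
}

// Insert `item` so that `items` stays sorted by `createdAt`.
// Items with an equal timestamp keep their arrival order (the new one goes last).
function insertChronologically<T extends TimedItem>(items: T[], item: T): void {
  let lo = 0;
  let hi = items.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (items[mid]!.createdAt <= item.createdAt) lo = mid + 1;
    else hi = mid;
  }
  items.splice(lo, 0, item);
}
```
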
| Original file line number | Diff line number | Diff line change | ||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| @@ -0,0 +1,14 @@ | ||||||||||||||||||
| # CLAUDE.md | ||||||||||||||||||
|
Contributor
🔴 Missing SPDX license header on new file agents/src/metrics/CLAUDE.md
New file lacks the SPDX license header required by CONTRIBUTING.md and CLAUDE.md Code Conventions. Not covered by any |
||||||||||||||||||
|
|
||||||||||||||||||
| Per-model/provider usage tracking and aggregation for billing and analytics. | ||||||||||||||||||
|
|
||||||||||||||||||
| ## Key Components | ||||||||||||||||||
|
|
||||||||||||||||||
| - **ModelUsageCollector** — Aggregates metrics by `provider:model` key. Handles both standard LLM metrics and RealtimeModel metrics (with token detail breakdowns: text, image, audio, cached). | ||||||||||||||||||
| - **Usage types**: `LLMModelUsage`, `TTSModelUsage`, `STTModelUsage`, `InterruptionModelUsage` — each with provider-specific fields. | ||||||||||||||||||
| - **`filterZeroValues()`** — Strips zero-valued fields from usage objects for clean JSON output. | ||||||||||||||||||
|
|
||||||||||||||||||
| ## Non-Obvious Patterns | ||||||||||||||||||
|
|
||||||||||||||||||
| - **Session duration tracking**: Some models (xAI) bill by session duration rather than tokens — tracked in `sessionDurationMs`. | ||||||||||||||||||
| - **UsageCollector is deprecated** — use `ModelUsageCollector` instead. | ||||||||||||||||||
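
As a rough illustration of what stripping zero-valued fields means for the JSON output, the sketch below drops any field whose value is exactly `0`. `filterZeroValuesSketch` is a hypothetical stand-in, not the framework's `filterZeroValues()`; how the real function treats `null`, `undefined`, or nested objects is not stated here and is left out.

```ts
// Hypothetical stand-in: keeps every field except those equal to the number 0.
function filterZeroValuesSketch<T extends Record<string, unknown>>(usage: T): Partial<T> {
  const out: Partial<T> = {};
  for (const key of Object.keys(usage) as (keyof T)[]) {
    const value = usage[key];
    if (value !== 0) out[key] = value;
  }
  return out;
}

// Example usage object with made-up field names.
const cleaned = filterZeroValuesSketch({
  provider: 'openai',
  model: 'gpt-4o-mini',
  inputTokens: 12,
  outputTokens: 0,
});
```
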
| Original file line number | Diff line number | Diff line change | ||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| @@ -0,0 +1,17 @@ | ||||||||||||||||||
| # CLAUDE.md | ||||||||||||||||||
|
Contributor
🔴 Missing SPDX license header on new file agents/src/stream/CLAUDE.md
New file lacks the SPDX license header required by CONTRIBUTING.md and CLAUDE.md Code Conventions. Not covered by any |
||||||||||||||||||
|
|
||||||||||||||||||
| Low-level async stream composition primitives built on the Web Streams API (`ReadableStream`, `WritableStream`, `TransformStream`). | ||||||||||||||||||
|
|
||||||||||||||||||
| ## Key Classes | ||||||||||||||||||
|
|
||||||||||||||||||
| - **StreamChannel<T, E>** — Bidirectional stream: write to it, read from it. `addStreamInput()` launches async reader loops to pipe external streams in. | ||||||||||||||||||
| - **DeferredReadableStream<T>** — Readable stream where the actual source is set later via `setSource()`. Supports detach/reattach. | ||||||||||||||||||
| - **MultiInputStream<T>** — Fan-in multiplexer: N dynamic inputs → 1 output. Inputs can be added/removed at runtime. Output stays open after all inputs end (waits for new inputs). | ||||||||||||||||||
| - **IdentityTransform<T>** — Pass-through `TransformStream` with `highWaterMark` set to `MAX_SAFE_INTEGER` to prevent backpressure. | ||||||||||||||||||
| - **mergeReadableStreams()** — Functional merge of N streams (adapted from Deno). If one errors, merged output closes. | ||||||||||||||||||
|
|
||||||||||||||||||
| ## Non-Obvious Patterns | ||||||||||||||||||
|
|
||||||||||||||||||
| - **IdentityTransform high water mark**: Intentionally disables backpressure on both sides. This follows the Python agents `channel.py` pattern — needed for concurrent sources. | ||||||||||||||||||
| - **Reader lock cleanup**: TypeErrors from releasing already-released locks are caught and ignored throughout. This is intentional. | ||||||||||||||||||
| - **MultiInputStream resilience**: Errors in one input don't kill the output stream. Failed inputs are removed silently. | ||||||||||||||||||
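
The high-water-mark pattern described above can be sketched with the standard `TransformStream` constructor, which accepts separate queuing strategies for its writable and readable sides. `IdentityTransformSketch` is a hypothetical name; the framework's own `IdentityTransform` may be written differently.

```ts
// Pass-through stream: with no transformer argument, TransformStream forwards chunks unchanged.
// Setting both high water marks to MAX_SAFE_INTEGER means neither side signals backpressure
// in practice, at the cost of unbounded buffering if the consumer is slow.
class IdentityTransformSketch<T> extends TransformStream<T, T> {
  constructor() {
    super(
      undefined,
      { highWaterMark: Number.MAX_SAFE_INTEGER }, // writable side
      { highWaterMark: Number.MAX_SAFE_INTEGER }, // readable side
    );
  }
}
```
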
| Original file line number | Diff line number | Diff line change | ||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| @@ -0,0 +1,24 @@ | ||||||||||||||||||
| # CLAUDE.md | ||||||||||||||||||
|
Contributor
🔴 Missing SPDX license header on new file agents/src/stt/CLAUDE.md
New file lacks the SPDX license header required by CONTRIBUTING.md and CLAUDE.md Code Conventions. Not covered by any |
||||||||||||||||||
|
|
||||||||||||||||||
| Speech-to-text abstractions with streaming, VAD-based adapters, and automatic retry. | ||||||||||||||||||
|
|
||||||||||||||||||
| ## Key Classes | ||||||||||||||||||
|
|
||||||||||||||||||
| - **STT** — Abstract base. Subclasses implement `_recognize()` (one-shot) and `stream()` (streaming). Emits `metrics_collected` and `error` events. | ||||||||||||||||||
| - **SpeechStream** — Async iterable consuming audio frames via `pushFrame()`, yielding `SpeechEvent` objects. Handles audio resampling internally if sample rates don't match. | ||||||||||||||||||
| - **StreamAdapter** — Wraps a non-streaming STT + VAD to create a streaming interface. Buffers audio during speech, calls `recognize()` on end-of-speech. | ||||||||||||||||||
|
|
||||||||||||||||||
| ## Architecture | ||||||||||||||||||
|
|
||||||||||||||||||
| ``` | ||||||||||||||||||
| pushFrame() → AudioResampler (if needed) → AsyncIterableQueue → run() (provider impl) → output queue → consumer | ||||||||||||||||||
| ``` | ||||||||||||||||||
|
|
||||||||||||||||||
| ## Non-Obvious Patterns | ||||||||||||||||||
|
|
||||||||||||||||||
| - **Multi-queue architecture**: Input queue, intermediate queue (for metrics monitoring), and output queue run concurrently. | ||||||||||||||||||
| - **FLUSH_SENTINEL**: Private static symbol signals flush operations internally without creating actual events. | ||||||||||||||||||
| - **startSoon() in constructor**: Defers `mainTask()` until after constructor completes to avoid accessing uninitialized fields. | ||||||||||||||||||
| - **Resampler created on-demand**: Only instantiated when first frame with different sample rate arrives. | ||||||||||||||||||
| - **Retry with exponential backoff**: `mainTask()` retries on `APIError`/`APIConnectionError`; other errors are immediately fatal. | ||||||||||||||||||
| - **startTimeOffset**: Can offset transcription timestamps for stream resumption. | ||||||||||||||||||
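
The retry rule above (retry API-level errors with exponential backoff, fail immediately on anything else) can be sketched as follows. The two error classes are local stand-ins declared so the example is self-contained, and `backoffDelayMs`, `runWithRetry`, and the numeric defaults are assumptions; the schedule and limits in `mainTask()` are not given here.

```ts
// Local stand-ins for the framework's error classes.
class APIError extends Error {}
class APIConnectionError extends APIError {}

// Hypothetical schedule: baseMs * 2^attempt, capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 100, maxMs = 5000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Retries `run` on APIError (including APIConnectionError); rethrows other errors at once.
async function runWithRetry<T>(
  run: () => Promise<T>,
  maxRetries: number,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await run();
    } catch (err) {
      if (!(err instanceof APIError) || attempt >= maxRetries) throw err;
      await sleep(backoffDelayMs(attempt));
    }
  }
}
```

Passing `sleep` as a parameter keeps the loop testable without real timers.
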
| Original file line number | Diff line number | Diff line change | ||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| @@ -0,0 +1,16 @@ | ||||||||||||||||||
| # CLAUDE.md | ||||||||||||||||||
|
Contributor
🔴 Missing SPDX license header on new file agents/src/telemetry/CLAUDE.md
New file lacks the SPDX license header required by CONTRIBUTING.md and CLAUDE.md Code Conventions. Not covered by any |
||||||||||||||||||
|
|
||||||||||||||||||
| OpenTelemetry integration for distributed tracing, logging, and session report uploads. | ||||||||||||||||||
|
|
||||||||||||||||||
| ## Key Components | ||||||||||||||||||
|
|
||||||||||||||||||
| - **DynamicTracer** — Runtime-switchable tracer provider wrapper. Global instance exported as `tracer`. | ||||||||||||||||||
| - **setupCloudTracer()** — Complete cloud observability setup: OTLP exporter, metadata span processor, Pino cloud log exporter. Uses JWT for auth. | ||||||||||||||||||
| - **uploadSessionReport()** — Uploads chat history (JSON), metrics (protobuf header), and audio (OGG) to LiveKit Cloud via multipart FormData. | ||||||||||||||||||
| - **MetadataLogProcessor / ExtraDetailsProcessor** — Inject room_id, job_id, logger names into all log records. | ||||||||||||||||||
|
|
||||||||||||||||||
| ## Non-Obvious Patterns | ||||||||||||||||||
|
|
||||||||||||||||||
| - **Monotonic timestamp ordering**: Session report adds 1μs offsets to colliding timestamps to ensure correct dashboard display ordering. | ||||||||||||||||||
| - **Dynamic tracer provider**: Can change tracer provider mid-session (used when cloud connection establishes after startup). | ||||||||||||||||||
| - **Metadata injection**: All spans automatically tagged with room_id and job_id via `MetadataSpanProcessor`. | ||||||||||||||||||
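
One way to picture the monotonic timestamp ordering above: walk the timestamps in order and bump any value that does not exceed its predecessor by one microsecond. `makeStrictlyIncreasing` is a hypothetical helper that assumes its input is already sorted and expressed in microseconds; the session report code may implement the offset differently.

```ts
// Input: timestamps in microseconds, sorted in non-decreasing order.
// Output: same length, strictly increasing, with collisions pushed forward by 1 µs each.
function makeStrictlyIncreasing(timestampsUs: number[]): number[] {
  const out: number[] = [];
  let prev = -Infinity;
  for (const ts of timestampsUs) {
    const next = ts <= prev ? prev + 1 : ts;
    out.push(next);
    prev = next;
  }
  return out;
}
```

Note that a bump can cascade: in `[10, 10, 10, 12]` the third value moves to 12, which in turn pushes the original 12 to 13.
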
| Original file line number | Diff line number | Diff line change | ||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| @@ -0,0 +1,15 @@ | ||||||||||||||||||
| # CLAUDE.md | ||||||||||||||||||
|
Contributor
🔴 Missing SPDX license header on new file agents/src/tokenize/CLAUDE.md
New file lacks the SPDX license header required by CONTRIBUTING.md and CLAUDE.md Code Conventions. Not covered by any |
||||||||||||||||||
|
|
||||||||||||||||||
| Streaming text tokenization for real-time TTS. Incrementally splits text into sentences or words with configurable buffering. | ||||||||||||||||||
|
|
||||||||||||||||||
| ## Key Classes | ||||||||||||||||||
|
|
||||||||||||||||||
| - **SentenceTokenizer / WordTokenizer** — Abstract bases with `tokenize()` (batch) and `stream()` (streaming) methods. | ||||||||||||||||||
| - **BufferedTokenStream** — Core streaming implementation. Buffers input until `minContextLength`, then tokenizes and holds output until `minTokenLength` before emitting. Each `flush()` generates a new `segmentId`. | ||||||||||||||||||
| - **Basic implementations** (`basic/`) — Default English tokenizers using rule-based sentence/word splitting. Includes hyphenation support. | ||||||||||||||||||
|
|
||||||||||||||||||
| ## Non-Obvious Patterns | ||||||||||||||||||
|
|
||||||||||||||||||
| - **Designed for TTS pipeline**: Text arrives incrementally from LLM streaming. Tokenizer buffers enough context for accurate sentence boundaries before emitting. | ||||||||||||||||||
| - **Tuple tokens**: Some tokenizers return `[text, startPos, endPos]` tuples for position tracking, not just strings. | ||||||||||||||||||
| - **Segment tracking**: `flush()` creates new segment IDs, allowing consumers to distinguish continuous speech from intentional breaks. | ||||||||||||||||||
🔴 Missing SPDX license header on new file CLAUDE.md (root)
The root `CLAUDE.md` is a new file but lacks the SPDX license header required by CONTRIBUTING.md ("When creating a new file, make sure to add SPDX headers for REUSE-3.2 compliance") and by the CLAUDE.md Code Conventions section itself. All existing markdown files in the repo (e.g., `CONTRIBUTING.md:1-5`, `README.md:1-5`, `CODE_OF_CONDUCT.md:1-5`) include an HTML-comment SPDX header. Additionally, `REUSE.toml` has no annotation covering `**/CLAUDE.md`, so these files will fail the REUSE/SPDX CI check.