
Fonada TTS plugin #5171

Open
Pragyanshu-Fonada wants to merge 15 commits into livekit:main from Pragyanshu-Fonada:feature/fonadalabs_livekit_agent


@Pragyanshu-Fonada Pragyanshu-Fonada commented Mar 20, 2026

Summary

Adds a new TTS plugin, livekit-plugins-fonadalabs, for the FonadaLabs API,
a high-quality text-to-speech service specializing in Indian languages.

Features

  • WebSocket-based streaming TTS
  • Dynamic language/voice catalog fetched from the FonadaLabs API at runtime
  • Supports Hindi (70 voices), Tamil (16 voices), Telugu (60 voices), English (70 voices)
  • Language can be specified by code ("hi") or display name ("Hindi")
  • Graceful fallback if catalog is unavailable
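A minimal sketch of how the code-or-name lookup and the fallback could fit together. The field names mirror the _Catalog(voices=..., code_to_name=..., name_to_code=...) call visible in the diff; the Catalog class and the resolve_language helper are hypothetical and are not taken from the plugin.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Catalog:
    # Hypothetical stand-in for the plugin's _Catalog.
    voices: Dict[str, List[str]] = field(default_factory=dict)
    code_to_name: Dict[str, str] = field(default_factory=dict)
    name_to_code: Dict[str, str] = field(default_factory=dict)


def resolve_language(catalog: Catalog, language: str) -> str:
    """Return the display name for a language given as a code or a name."""
    if not catalog.code_to_name:
        # Empty catalog (fetch failed): skip validation, let the server decide.
        return language
    key = language.lower()
    if key in catalog.code_to_name:
        return catalog.code_to_name[key]
    for name in catalog.name_to_code:
        if name.lower() == key:
            return name
    raise ValueError(f"Unsupported language: {language!r}")


catalog = Catalog(
    voices={"Hindi": ["Vaanee"]},
    code_to_name={"hi": "Hindi"},
    name_to_code={"Hindi": "hi"},
)
```

With this sketch, both "hi" and "Hindi" resolve to the display name "Hindi", and an empty catalog passes the input through unchanged.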

Environment Variable

  • FONADALABS_API_KEY — FonadaLabs API key

devin-ai-integration[bot]

This comment was marked as resolved.

devin-ai-integration[bot]

This comment was marked as resolved.

devin-ai-integration[bot]

This comment was marked as resolved.


@devin-ai-integration devin-ai-integration bot left a comment


Devin Review found 2 new potential issues.

View 8 additional findings in Devin Review.


"voice_id": self._resolved_voice,
"language": self._resolved_lang_name, # display name, e.g. "Hindi"
}
await ws.send_str(json.dumps(payload))

🔴 Missing _mark_started() call prevents TTS metrics from being emitted

The fonadalabs SynthesizeStream._run() never calls self._mark_started(), so self._started_time remains 0. In the base class _metrics_monitor_task at livekit-agents/livekit/agents/tts/tts.py:539, the check if not self._started_time causes _emit_metrics() to return early, meaning no TTS metrics (TTFB, duration, audio duration, etc.) are ever emitted for this plugin. Every other streaming TTS plugin in the repository (Sarvam, Cartesia, Deepgram, ElevenLabs, Google, etc.) calls self._mark_started() before or when sending input to the TTS service.

Suggested change
- await ws.send_str(json.dumps(payload))
+ self._mark_started()
+ await ws.send_str(json.dumps(payload))
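The gate described in this comment can be illustrated with a toy model. This is a sketch under assumptions, not LiveKit's actual base class: StreamSketch and its emitted list are hypothetical, and only the names _mark_started, _started_time, and _emit_metrics come from the review text.

```python
import time
from typing import Dict, List


class StreamSketch:
    """Toy model of the metrics gate described in the review."""

    def __init__(self) -> None:
        self._started_time = 0.0
        self.emitted: List[Dict[str, float]] = []

    def _mark_started(self) -> None:
        # Assumption: the start time is recorded once, on the first call.
        if self._started_time == 0.0:
            self._started_time = time.perf_counter()

    def _emit_metrics(self) -> None:
        if not self._started_time:
            return  # never marked as started, so nothing is emitted
        elapsed = time.perf_counter() - self._started_time
        self.emitted.append({"duration": elapsed})


silent = StreamSketch()
silent._emit_metrics()  # no _mark_started(): the early return is taken

marked = StreamSketch()
marked._mark_started()
marked._emit_metrics()  # one metrics record is produced
```

In this model the stream that skips _mark_started() ends with an empty emitted list, which matches the behavior the comment reports for the plugin.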

if not segments:
raise ValueError("No text received from input channel.")

text = " ".join(segments)

@devin-ai-integration devin-ai-integration bot Mar 20, 2026


🔴 " ".join(segments) introduces spurious spaces between LLM tokens

LLM tokens pushed via push_text() already contain their own whitespace (e.g. "Hello", " world", "!"). The base class itself concatenates them without spaces at livekit-agents/livekit/agents/tts/tts.py:593 (self._pushed_text += token). Using " ".join(segments) on line 288 inserts an extra space between every token, producing text like "Hello  world !" instead of "Hello world!". This corrupts the text sent to the TTS API, degrading speech quality. Should use "".join(segments) instead.
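The difference is easy to reproduce with the three example tokens from this comment. The variable names are illustrative only.

```python
# Tokens as an LLM would stream them: each carries its own leading whitespace.
tokens = ["Hello", " world", "!"]

spaced = " ".join(tokens)  # separator added on top of the existing whitespace
joined = "".join(tokens)   # plain concatenation, as the base class does

print(repr(spaced))  # 'Hello  world !'
print(repr(joined))  # 'Hello world!'
```

Note the doubled space after "Hello" and the stray space before "!" in the first result.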


@devin-ai-integration devin-ai-integration bot left a comment


Devin Review found 1 new potential issue.

View 9 additional findings in Devin Review.


f"[FonadaLabs] Could not load catalog from {FONADALABS_SUPPORTED_VOICES_URL}: {exc}. "
"Language/voice validation will be skipped — server will validate instead."
)
_catalog_cache = _Catalog(voices={}, code_to_name={}, name_to_code={})

🟡 Failed catalog fetch is permanently cached, preventing recovery after transient errors

When _load_catalog fails (e.g., due to a transient network error), it caches an empty _Catalog at line 123. Because subsequent calls check if _catalog_cache is not None at line 77 and return the cached empty catalog immediately, the plugin never retries fetching the catalog. The _invalidate_catalog() function exists but is only called for specific TTS server error types (unsupported_voice, invalid_language). If the user's input happens to be valid (e.g., language="Hindi", voice="Vaanee"), the server may accept the request and _invalidate_catalog is never triggered, so the empty catalog persists for the entire process lifetime. This means client-side language/voice validation is permanently disabled after a single transient failure at startup.

Prompt for agents
In livekit-plugins/livekit-plugins-fonadalabs/livekit/plugins/fonadalabs/tts.py, the _load_catalog function at line 118-123 caches an empty _Catalog on failure. Instead, it should NOT cache the result on failure so that subsequent calls can retry. Change line 123 from caching an empty catalog to returning a temporary empty catalog without setting _catalog_cache. For example, remove the assignment to _catalog_cache in the except block and instead return a fresh empty _Catalog(voices={}, code_to_name={}, name_to_code={}) directly. This way, the next call to _load_catalog will retry the API fetch. To prevent retry storms, consider adding an asyncio.Lock and/or a backoff timer.
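One possible shape for this fix, sketched under assumptions: CatalogLoader, its fetch callback, and the retry_delay_s cool-down are hypothetical names, and a plain dict stands in for the plugin's _Catalog. The sketch caches only successful fetches, serializes callers with an asyncio.Lock, and uses a simple cool-down instead of a full backoff schedule.

```python
import asyncio
import time
from typing import Awaitable, Callable, Dict, Optional

Catalog = Dict[str, str]  # hypothetical stand-in for the plugin's _Catalog


class CatalogLoader:
    def __init__(
        self,
        fetch: Callable[[], Awaitable[Catalog]],
        retry_delay_s: float = 30.0,  # assumed cool-down between attempts
    ) -> None:
        self._fetch = fetch
        self._retry_delay_s = retry_delay_s
        self._cache: Optional[Catalog] = None
        self._retry_after = 0.0  # monotonic time before which we do not refetch
        self._lock = asyncio.Lock()

    async def load(self) -> Catalog:
        async with self._lock:  # one fetch at a time, no retry storms
            if self._cache is not None:
                return self._cache
            if time.monotonic() < self._retry_after:
                return {}  # still cooling down: temporary empty catalog
            try:
                self._cache = await self._fetch()
            except Exception:
                # Do NOT cache the failure; only remember when to try again.
                self._retry_after = time.monotonic() + self._retry_delay_s
                return {}
            return self._cache


async def _demo() -> list:
    calls = 0

    async def flaky_fetch() -> Catalog:
        nonlocal calls
        calls += 1
        if calls == 1:
            raise ConnectionError("transient failure")
        return {"hi": "Hindi"}

    loader = CatalogLoader(flaky_fetch, retry_delay_s=0.0)
    first = await loader.load()   # fetch fails, nothing is cached
    second = await loader.load()  # retried, succeeds, result is cached
    third = await loader.load()   # served from the cache, no new fetch
    return [first, second, third, calls]


results = asyncio.run(_demo())
```

The demo sets the cool-down to zero so the second call retries immediately; a real deployment would pick a non-zero delay.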

@Pragyanshu-Fonada Pragyanshu-Fonada changed the title from "Feature/fonadalabs livekit agent" to "Fonada TTS plugin" on Mar 20, 2026