fix(voice): prepend silence to TTS audio to prevent first-word clipping #29

Merged
DannyNs merged 1 commit into main from bug/tts-first-words-cutoff on Mar 19, 2026

DannyNs commented on Mar 19, 2026

Summary

  • PulseAudio's module-suspend-on-idle suspends the WSLg RDPSink after a period of inactivity. When new audio starts, the sink takes ~200-500ms to wake up, which clips the first words of speech.
  • Added a prependSilence() function that uses ffmpeg's adelay filter to prepend 300ms of silence to the MP3 before playback.
  • Applied to both the single-shot path and the first chunk of the chunked TTS path.
  • Falls back gracefully to the unpadded audio if ffmpeg is unavailable.
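
The approach in the summary can be sketched roughly as follows. This is a minimal illustration and not the merged implementation: only the name prependSilence(), the adelay filter, the 300ms figure, and the fallback behaviour come from this PR. The helpers buildAdelayArgs() and paddedPathFor(), the `.padded.mp3` naming, and the ffmpegBin parameter are assumptions made for the example.

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const execFileAsync = promisify(execFile);

// Hypothetical helper: build the ffmpeg arguments that delay the audio.
// "300|300" delays the first two channels, which covers mono and stereo
// MP3s; ffmpeg silently ignores delays for channels that do not exist.
export function buildAdelayArgs(input: string, output: string, delayMs: number): string[] {
  const delays = `${delayMs}|${delayMs}`;
  return ["-y", "-loglevel", "error", "-i", input, "-af", `adelay=${delays}`, output];
}

// Hypothetical naming scheme for the padded file: foo.mp3 -> foo.padded.mp3
export function paddedPathFor(input: string): string {
  return input.replace(/\.mp3$/i, "") + ".padded.mp3";
}

// Returns the path of the padded file, or the original path when ffmpeg
// is missing or exits with an error (the graceful fallback).
export async function prependSilence(
  input: string,
  delayMs = 300,
  ffmpegBin = "ffmpeg", // assumption: ffmpeg is resolved via PATH
): Promise<string> {
  const output = paddedPathFor(input);
  try {
    await execFileAsync(ffmpegBin, buildAdelayArgs(input, output, delayMs));
    return output;
  } catch {
    return input;
  }
}
```

Returning the input path on failure means the caller can play whatever path comes back without checking whether padding succeeded.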

Test plan

  • Run voice_speak with short text — verify first words are audible
  • Run voice_speak with long text (chunked path) — verify first words of first chunk are audible
  • Verify no regression on subsequent chunks (no extra silence between chunks)
  • Kill PulseAudio, run speak again — verify graceful fallback
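
The "no extra silence between chunks" check follows from padding only the first chunk. A small sketch of that rule, assuming a hypothetical silencePlan() helper that is not part of this PR:

```typescript
// Hypothetical helper: lead-in silence (in ms) to apply to each chunk.
// Only chunk 0 is padded, because the sink is already awake by the time
// later chunks start playing.
export function silencePlan(chunkCount: number, leadInMs = 300): number[] {
  return Array.from({ length: chunkCount }, (_, i) => (i === 0 ? leadInMs : 0));
}

console.log(silencePlan(3)); // → [300, 0, 0]
```

A single-shot request is the one-chunk case of the same rule, so silencePlan(1) yields [300].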

🤖 Generated with Claude Code

…clipping

PulseAudio's module-suspend-on-idle suspends the WSLg RDPSink after
inactivity. When new audio starts playing, the sink takes ~200-500ms
to wake up, causing the first words to be clipped.

Use ffmpeg's adelay filter to prepend 300ms of silence to the MP3
before playback. Applied to both single-shot and first-chunk of
chunked TTS. Falls back gracefully if ffmpeg is unavailable.
DannyNs merged commit cd42e1b into main on Mar 19, 2026
1 check passed
DannyNs deleted the bug/tts-first-words-cutoff branch on March 19, 2026 11:33