non-autoregressive

Here are 36 public repositories matching this topic...

lucidrains / soundstorm-pytorch

Implementation of SoundStorm, Efficient Parallel Audio Generation from Google DeepMind, in PyTorch

deep-learning transformers artificial-intelligence attention-mechanism non-autoregressive audio-generation

Updated Apr 24, 2025
Python

shivammehta25 / Matcha-TTS

[ICASSP 2024] 🍵 Matcha-TTS: A fast TTS architecture with conditional flow matching

machine-learning text-to-speech deep-learning tts probabilistic tts-engines tts-api diffusion-model diffusion-models non-autoregressive probabilistic-machine-learning flow-matching

Updated Jan 19, 2026
Jupyter Notebook

ictnlp / StreamSpeech

StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.

Updated Jun 29, 2025
Python

keonlee9420 / DiffGAN-TTS

PyTorch Implementation of DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs

text-to-speech deep-neural-networks pytorch tts speech-synthesis gan generative-model diffusion diffusion-models neural-tts non-autoregressive fastspeech multi-speaker-tts hifi-gan ddpm non-ar diffspeech diffgan-tts single-speaker-tts

Updated Feb 21, 2022
Python

keonlee9420 / PortaSpeech

PyTorch Implementation of PortaSpeech: Portable and High-Quality Generative Text-to-Speech

text-to-speech deep-neural-networks pytorch tts speech-synthesis generative-model vae normalizing-flows high-quality neural-tts non-autoregressive fastspeech hifi-gan non-ar mel-gan portable-tts

Updated Feb 17, 2022
Python

keonlee9420 / Comprehensive-Transformer-TTS

A non-autoregressive, Transformer-based text-to-speech system supporting a family of SOTA transformers with supervised and unsupervised duration modeling. This project grows with the research community, aiming to achieve the ultimate TTS.

Updated Sep 24, 2022
Python

keonlee9420 / Expressive-FastSpeech2

PyTorch Implementation of Non-autoregressive Expressive (emotional, conversational) TTS based on FastSpeech2, supporting English, Korean, and your own languages.

text-to-speech tts speech-synthesis expressive-speech-synthesis non-autoregressive emotional-tts korean-tts expressive-tts emotional-speech-synthesis korean-speech-synthesis conversational-tts conversational-speech-synthesis

Updated Aug 25, 2021
Python

keonlee9420 / DailyTalk

Official repository of DailyTalk: Spoken Dialogue Dataset for Conversational Text-to-Speech, ICASSP 2023

text-to-speech pytorch tts speech-synthesis dataset conversational-ai non-autoregressive conversational-data tts-dataset conversational-tts

Updated Jun 5, 2025
Python

keonlee9420 / DiffSinger

PyTorch implementation of DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (focused on DiffSpeech)

text-to-speech pytorch tts speech-synthesis english diffusion singing-voice diffusion-models neural-tts non-autoregressive fastspeech ddpm diffsinger

Updated Feb 3, 2022
Python

HKUNLP / diffusion-of-thoughts

[NeurIPS 2024] Code for the paper "Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models"

machine-learning natural-language-processing text-generation pytorch diffusion-models non-autoregressive mathematical-reasoning chain-of-thought-reasoning diffusion-lm

Updated Mar 4, 2025
Python

keonlee9420 / StyleSpeech

PyTorch Implementation of Meta-StyleSpeech: Multi-Speaker Adaptive Text-to-Speech Generation

text-to-speech style pytorch tts speech-synthesis english speaker prosody meta-learning one-shot speaker-adaptation neural-tts non-autoregressive fastspeech speech-style stylespeech meta-stylespeech unseen-speaker

Updated Feb 10, 2022
Python

keonlee9420 / Cross-Speaker-Emotion-Transfer

PyTorch Implementation of ByteDance's Cross-speaker Emotion Transfer Based on Speaker Condition Layer Normalization and Semi-Supervised Training in Text-To-Speech

text-to-speech deep-neural-networks pytorch tts speech-synthesis generative-model semi-supervised-learning global-style-tokens neural-tts non-autoregressive parallel-tacotron non-ar emotion-transfer cross-speaker conditional-layer-normalization

Updated Nov 9, 2022
Python

keonlee9420 / Parallel-Tacotron2

PyTorch Implementation of Google's Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling

text-to-speech duration pytorch tts speech-synthesis english vae self-attention neural-tts non-autoregressive fastspeech parallel-tacotron parallel-tacotron2

Updated Nov 18, 2021
Python

xcfcode / What-I-Have-Read

Paper lists, notes, and slides, with a focus on NLP. For summarization, please refer to https://github.com/xcfcode/Summarization-Papers

Updated Jun 12, 2022

keonlee9420 / Comprehensive-E2E-TTS

A non-autoregressive, end-to-end text-to-speech (text-to-wav) system supporting a family of SOTA unsupervised duration modeling methods. This project grows with the research community, aiming to achieve the ultimate E2E-TTS.