GraphDone-TTS

A high-performance, production-ready text-to-speech server built with Piper-TTS, providing OpenAI-compatible API endpoints for GraphDone applications.

🚀 Features

  • OpenAI-Compatible API: Drop-in replacement for OpenAI's TTS endpoints
  • High Performance: Built with FastAPI and optimized for speed
  • Multiple Voices: Support for 6+ voices with multiple quality levels
  • Web Interface: Interactive UI for testing and configuration
  • Smart Caching: Intelligent LRU+LFU cache system with 10GB limit
  • Docker Ready: Single or multi-container deployment options
  • Format Support: MP3, WAV, OPUS, AAC, FLAC, PCM output formats
  • Rate Limiting: Built-in protection against abuse
  • Batch Processing: Generate multiple voices in parallel

📁 Project Structure

GraphDone-TTS/
├── src/                      # Source code
│   ├── piper-server/        # FastAPI TTS server
│   └── webui/               # Flask web interface
├── docker/                   # Docker configurations
│   ├── Dockerfile.single    # Single container build
│   └── docker-compose.*.yml # Compose configurations
├── scripts/                  # Automation scripts
│   ├── setup_tts.sh         # Basic setup
│   ├── setup_tts_with_ui.sh # Full setup with UI
│   └── download_voices.sh   # Voice model downloader
├── config/                   # Configuration files
│   ├── voice_to_speaker.yaml
│   └── pre_process_map.yaml
├── voices/                   # ONNX voice models
├── tests/                    # Test files and scripts
├── docs/                     # Documentation
└── examples/                 # Usage examples

🔧 Installation

Quick Start (Recommended)

# Clone the repository
git clone https://github.com/graphdone/GraphDone-TTS.git
cd GraphDone-TTS

# Just run start - it handles everything automatically!
./start

That's it! The ./start script will:

  • ✅ Install Docker if needed
  • ✅ Download voice models
  • ✅ Build containers
  • ✅ Start all services
  • ✅ Show you the URLs

Management Commands

./start          # Start everything (auto-setup)
./start stop     # Stop all services
./start restart  # Restart services
./start logs     # View logs
./start status   # Check status
./start clean    # Clean up everything

📖 API Usage

Generate Speech

curl -X POST "http://localhost:8000/v1/audio/speech" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tts-1",
    "input": "Hello, this is GraphDone TTS!",
    "voice": "nova",
    "response_format": "mp3"
  }' \
  --output speech.mp3
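The same request can be made from Python using only the standard library. This is a minimal sketch mirroring the curl example above (endpoint, payload fields, and defaults are taken from this README; the `build_payload`/`synthesize` helpers are illustrative, not part of the server's codebase):

```python
import json
import urllib.request

TTS_URL = "http://localhost:8000/v1/audio/speech"

def build_payload(text, voice="nova", fmt="mp3", model="tts-1"):
    """Build the JSON body expected by the OpenAI-compatible endpoint."""
    return json.dumps({
        "model": model,
        "input": text,
        "voice": voice,
        "response_format": fmt,
    }).encode("utf-8")

def synthesize(text, voice="nova", fmt="mp3", out_path="speech.mp3"):
    """POST the request and write the returned audio bytes to disk."""
    req = urllib.request.Request(
        TTS_URL,
        data=build_payload(text, voice, fmt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp, open(out_path, "wb") as f:
        f.write(resp.read())

# With the server running:
# synthesize("Hello, this is GraphDone TTS!", voice="nova", fmt="mp3")
```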

Available Voices

  • alloy - Neutral, professional
  • echo - Warm, conversational
  • fable - Expressive, narrative
  • onyx - Deep, authoritative
  • nova - Energetic, youthful
  • shimmer - Gentle, soothing

Supported Formats

  • mp3 - Default, compressed audio
  • wav - Uncompressed, high quality
  • opus - Efficient compression
  • aac - Apple-compatible
  • flac - Lossless compression
  • pcm - Raw audio data
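Since the server advertises parallel batch generation, one way to exercise every voice and format from a client is a small thread pool. A sketch under the assumptions above (the voice list and endpoint come from this README; the helper names are hypothetical):

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor

TTS_URL = "http://localhost:8000/v1/audio/speech"
VOICES = ["alloy", "echo", "fable", "onyx", "nova", "shimmer"]

def out_name(voice, fmt="mp3"):
    """Output filename for one voice/format pair, e.g. 'nova.mp3'."""
    return f"{voice}.{fmt}"

def fetch_voice(text, voice, fmt="mp3"):
    """Synthesize `text` in one voice and return the raw audio bytes."""
    body = json.dumps({"model": "tts-1", "input": text,
                       "voice": voice, "response_format": fmt}).encode("utf-8")
    req = urllib.request.Request(TTS_URL, data=body,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return resp.read()

def generate_all(text, fmt="mp3"):
    """Generate the same text in every voice in parallel, one file per voice."""
    with ThreadPoolExecutor(max_workers=len(VOICES)) as pool:
        audios = pool.map(lambda v: fetch_voice(text, v, fmt), VOICES)
        for voice, audio in zip(VOICES, audios):
            with open(out_name(voice, fmt), "wb") as f:
                f.write(audio)
```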

🖥️ Web Interface

Access the web UI at http://localhost:3000

Features:

  • Test different voices and settings
  • Batch generation for multiple voices
  • Real-time audio playback
  • Cache management dashboard
  • API endpoint testing

🐳 Docker Deployment

Build Custom Image

# Build complete image with all voices
./scripts/package_single.sh

# Run the container
docker run -d -p 8000:8000 -p 3000:3000 \
  --name graphdone-tts \
  tts-server-complete:latest

Docker Compose Options

# Development (multi-container)
docker-compose -f docker/docker-compose.yml up

# Production (single container)
docker-compose -f docker/docker-compose.single.yml up

⚙️ Configuration

Voice Configuration

Edit config/voice_to_speaker.yaml to customize voice mappings:

nova:
  low: en_US-amy-low
  medium: en_US-amy-medium
  high: en_US-amy-medium
  x_low: en_US-amy-low
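Conceptually, this map resolves an OpenAI-style voice name plus a quality level to a Piper ONNX model. A sketch of that lookup (the `nova` entries are from the YAML above; the `resolve_model` helper is hypothetical, not the server's actual code):

```python
# Mirror of the voice_to_speaker.yaml mapping shown above.
VOICE_MAP = {
    "nova": {
        "low": "en_US-amy-low",
        "medium": "en_US-amy-medium",
        "high": "en_US-amy-medium",
        "x_low": "en_US-amy-low",
    },
}

def resolve_model(voice, quality="medium", voice_map=VOICE_MAP):
    """Look up the Piper model name for a voice/quality pair."""
    try:
        return voice_map[voice][quality]
    except KeyError:
        raise ValueError(f"unknown voice/quality: {voice}/{quality}")
```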

Text Preprocessing

Customize text processing in config/pre_process_map.yaml:

abbreviations:
  "Mr.": "Mister"
  "Dr.": "Doctor"

Environment Variables

# Cache settings
CACHE_DIR=/app/output/cache
MAX_CACHE_SIZE_GB=10

# API configuration
TTS_API_URL=http://localhost:8000
SECRET_KEY=your-secret-key

# Performance
MAX_WORKERS=8
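The caching behavior described earlier (identical requests served from disk) typically rests on a deterministic cache key derived from the request parameters. A hypothetical sketch of such a key scheme, not the server's actual code:

```python
import hashlib
import json

def cache_key(text, voice, fmt, model="tts-1"):
    """Derive a stable cache filename for a synthesis request.

    Identical requests hash to the same name, so repeats hit the cache;
    changing any parameter produces a different key.
    """
    blob = json.dumps({"model": model, "input": text, "voice": voice,
                       "response_format": fmt}, sort_keys=True)
    digest = hashlib.sha256(blob.encode("utf-8")).hexdigest()
    return f"{digest}.{fmt}"
```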

🧪 Testing

Run the comprehensive test suite:

# Run all tests
./start test

# Test specific component
./tests/test_tts.sh

# Manual API test
curl http://localhost:8000/health
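Scripted tests usually need to wait for the server before hitting the API. A sketch of polling the health endpoint above (the `/health` path is from this README; the helpers and retry parameters are assumptions):

```python
import time
import urllib.error
import urllib.request

def is_healthy(url="http://localhost:8000/health", timeout=2.0):
    """Return True if the health endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False

def wait_until_healthy(url="http://localhost:8000/health",
                       retries=30, delay=1.0):
    """Poll the health endpoint until it responds, or give up."""
    for _ in range(retries):
        if is_healthy(url):
            return True
        time.sleep(delay)
    return False
```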

📊 Performance

  • Response Time: < 500ms for cached content
  • Generation Speed: 2-5 seconds for new content
  • Concurrent Requests: Handles 100+ simultaneous requests
  • Cache Hit Rate: 70%+ in production
  • Memory Usage: < 2GB under normal load

🔒 Security

  • Input validation and sanitization
  • Rate limiting on all endpoints
  • Path traversal protection
  • Non-root container execution
  • Secure file handling

📚 Documentation

Additional guides and reference material live in the docs/ directory of the repository.

🤝 Contributing

Contributions are welcome! Please read our contributing guidelines before submitting PRs.

📄 License

This project is licensed under the MIT License - see LICENSE file for details.

Third-Party Licenses

See the Acknowledgments section below for third-party components and their licenses.

🙏 Acknowledgments

  • Built with Piper-TTS - A fast, local neural text-to-speech system (MIT License)
  • Piper-TTS models and voice synthesis technology by Michael Hansen
  • OpenAI API compatibility for seamless integration
  • GraphDone team for project support

📞 Support

For issues and questions, please open an issue on the GitHub repository.


Made with ❤️ by the GraphDone Team