Skip to content

Latest commit

 

History

History
31 lines (24 loc) · 1.88 KB

File metadata and controls

31 lines (24 loc) · 1.88 KB

Deployment Targets and Scaling

Voice Creator Studio packages all services into Docker containers and targets cloud environments that support container orchestration.

Deployment Targets

  • Containers: Each service (FastAPI backend, React frontend, Celery workers, Redis) runs in a dedicated Docker image.
  • Cloud Services: Production deployments run on Kubernetes (e.g., Amazon EKS or Google GKE). Object storage uses services like Amazon S3; PostgreSQL or Aurora host persistent data; Redis is provided by a managed service such as AWS ElastiCache.
  • Local Development: Docker Compose starts the full stack on a single host for developers.

Scaling Strategy

Web Requests

  • FastAPI instances are replicated behind a load balancer (NGINX or a cloud load balancer).
  • Horizontal Pod Autoscalers adjust replica counts based on CPU and memory utilization.
  • Static assets from the React build are served via a CDN to offload traffic from the backend.

Background Jobs

  • Celery workers run in separate deployments that can scale independently from the web layer.
  • Queue depth and processing time drive autoscaling decisions; GPU jobs use dedicated nodes with limited concurrency.
  • Workers are labeled by capability (CPU vs. GPU) so tasks are routed to matching queues.

Hardware Requirements

Training

  • GPU: NVIDIA GPU with at least 16 GB VRAM (e.g., Tesla V100/A100). Multi-GPU setups improve throughput for large voices.
  • CPU/RAM: 8+ vCPUs and 32 GB RAM minimum to feed the GPU and handle preprocessing.
  • Storage: 500 GB SSD for datasets, checkpoints, and intermediate artifacts.

Inference

  • GPU (optional): 8 GB VRAM (e.g., T4) accelerates real-time synthesis, but CPU-only inference is possible for smaller models.
  • CPU/RAM: 4 vCPUs and 8–16 GB RAM support typical interactive workloads.
  • Storage: 50 GB SSD accommodates models and cached assets.