Chatterbox is a 0.5B-parameter zero-shot text-to-speech model that generates expressive dialogue from text. It is presented as state of the art and as outperforming commercial APIs such as ElevenLabs.
Reach for it when an agent needs expressive spoken dialogue from text without per-voice training, since its zero-shot design works on new voices out of the box.
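
A minimal sketch of how an agent might call the model, assuming the `chatterbox-tts` Python package with its `ChatterboxTTS.from_pretrained` and `generate` calls, and `torchaudio` for writing the file. The helper names `build_generate_kwargs` and `synthesize` are hypothetical, introduced only for illustration. Check the package's own documentation before relying on the exact signatures.

```python
from typing import Optional


def build_generate_kwargs(audio_prompt_path: Optional[str] = None) -> dict:
    """Collect the optional keyword arguments for the generate call.

    Hypothetical helper: with no reference clip the model's default voice
    is used; with one, the clip is passed through as the voice prompt.
    """
    kwargs = {}
    if audio_prompt_path is not None:
        # Assumption: a short reference clip is supplied via `audio_prompt_path`.
        kwargs["audio_prompt_path"] = audio_prompt_path
    return kwargs


def synthesize(
    text: str,
    out_path: str,
    audio_prompt_path: Optional[str] = None,
    device: str = "cpu",
) -> str:
    """Generate speech for `text` and write it to `out_path` as a WAV file."""
    # Imported lazily so this module loads even without the TTS dependencies.
    import torchaudio
    from chatterbox.tts import ChatterboxTTS  # assumed import path

    # Loads the pretrained weights; this downloads them on first use.
    model = ChatterboxTTS.from_pretrained(device=device)
    wav = model.generate(text, **build_generate_kwargs(audio_prompt_path))
    # `model.sr` is assumed to hold the output sample rate.
    torchaudio.save(out_path, wav, model.sr)
    return out_path
```

Calling `synthesize("Hello there.", "hello.wav")` would use the default voice, while passing `audio_prompt_path="reference.wav"` (a placeholder file name) would ask the model to speak in the voice of that clip, with no per-voice training step.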
