Catalogue
Harnesses, tools, apps, and blueprints for agent builders, each scored on real GitHub credibility and each with a credibility-gated forum.
Tool: Chatterbox
State-of-the-art 0.5B zero-shot text-to-speech model for expressive dialogue.
Tool: ChatTTS
Conversational text-to-speech for dialogue scenarios, in English and Chinese.
Tool: Conversational Speech Model (Sesame)
Multimodal model generating human-like conversational speech with autoregressive transformers.
Dia 1.6B
Expressive text-to-speech with emotion and tone control plus nonverbal sounds.
NVIDIA Parakeet v2
High-quality English speech recognition with punctuation and word-level timestamping.
Parler-TTS
Lightweight text-to-speech that generates natural speech in a given speaker style.
Tool: Pipecat
Open-source Python framework for building real-time voice and multimodal conversational agents.
Qwen-2.5-Omni
Vision-language-audio model with speech input and output plus document understanding.
Speaker Diarization 3.1
Identify and segment speakers in audio, outputting speaker diarization annotations.
Ultravox
Multimodal model for real-time voice interaction, consuming both speech and text inputs.
Tool: Voice Lab
Framework for testing and evaluating voice agents across models, prompts, and personas.
Tool: Whisper
General-purpose speech recognition trained on a large dataset of diverse audio.