Whisper is a general-purpose speech-to-text model trained on a large dataset of diverse audio. It performs multilingual speech recognition, speech translation, and language identification.
Reach for it when you need general-purpose transcription across many languages, or when you want transcription, speech translation, and language identification from a single model.
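
As a rough illustration of using one model for all three tasks, here is a minimal sketch. It assumes the open-source `openai-whisper` Python package (with `ffmpeg` available on the system), whose models expose a `transcribe` method that accepts `task` and `language` options and returns a dict containing `text` and `language`. The helper names `run_whisper` and `load_and_run`, the `"base"` model size, and any audio path passed in are illustrative choices, not part of the original text.

```python
from typing import Any, Dict, Optional


def run_whisper(
    model: Any,
    audio_path: str,
    translate: bool = False,
    language: Optional[str] = None,
) -> Dict[str, Any]:
    """Transcribe one audio file, or translate its speech into English.

    `model` is assumed to behave like an openai-whisper model: it has a
    `transcribe(path, **options)` method returning a dict with "text"
    and "language" keys. (Hypothetical helper, for illustration only.)
    """
    # The same model handles both tasks; only the `task` option changes.
    options: Dict[str, Any] = {"task": "translate" if translate else "transcribe"}
    if language is not None:
        # Passing a language code skips automatic language identification.
        options["language"] = language
    result = model.transcribe(audio_path, **options)
    return {"text": result["text"].strip(), "language": result.get("language")}


def load_and_run(audio_path: str, model_name: str = "base") -> Dict[str, Any]:
    """Load a Whisper checkpoint and transcribe `audio_path`.

    Assumes `pip install openai-whisper`; the import is kept local so the
    helper above can be exercised without the package installed.
    """
    import whisper

    model = whisper.load_model(model_name)
    return run_whisper(model, audio_path)
```

With the package installed, calling `load_and_run` on a local audio file should return the recognized text along with the detected language code; passing `translate=True` to `run_whisper` requests an English translation instead of a same-language transcript.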
