NVIDIA Parakeet v2 is a 600M-parameter speech-to-text model for high-quality English speech recognition. It outputs punctuation and capitalization, and it provides word-level timestamps.
Reach for it when you need accurate English transcription with precise word-level timing, for example to align captions or index spoken content by timestamp.
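
As a rough illustration of the caption-alignment use case, the sketch below groups word-level timestamps into SRT caption cues. It does not call the model. It assumes the transcription step has already produced a list of dictionaries with `word`, `start`, and `end` keys (times in seconds). That input shape, the function names `format_srt_time` and `words_to_srt`, and the `max_words` and `max_gap` thresholds are all assumptions made for this example, not part of Parakeet's documented output.

```python
from typing import Dict, List, Union

# Hypothetical input shape: {"word": "Hello,", "start": 0.0, "end": 0.4}
Word = Dict[str, Union[str, float]]


def format_srt_time(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words: List[Word], max_words: int = 7, max_gap: float = 0.8) -> str:
    """Group timestamped words into caption cues and render them as SRT text.

    A new cue starts when the current cue already holds `max_words` words,
    or when the silence before the next word exceeds `max_gap` seconds.
    Both thresholds are illustrative defaults, not recommended values.
    """
    cues: List[List[Word]] = []
    current: List[Word] = []
    for w in words:
        if current and (
            len(current) >= max_words
            or float(w["start"]) - float(current[-1]["end"]) > max_gap
        ):
            cues.append(current)
            current = []
        current.append(w)
    if current:
        cues.append(current)

    blocks = []
    for index, cue in enumerate(cues, start=1):
        start = format_srt_time(float(cue[0]["start"]))
        end = format_srt_time(float(cue[-1]["end"]))
        text = " ".join(str(w["word"]) for w in cue)
        blocks.append(f"{index}\n{start} --> {end}\n{text}")

    # SRT cues are separated by a blank line; an empty input yields an empty string.
    return "\n\n".join(blocks) + "\n" if blocks else ""
```

Passing the word list from a transcript to `words_to_srt` returns a string that can be written to a `.srt` file. The same grouping loop could feed a search index instead, storing each cue's text alongside its start time.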