Introducing Nova-2: The Fastest, Most Accurate Speech-to-Text API
Deepgram introduces Nova-2, a next-generation speech-to-text model that it reports outperforms alternatives on accuracy, speed, and cost. Nova-2 is 18% more accurate than its predecessor, Nova-1, and offers a 36% relative word error rate (WER) improvement over OpenAI Whisper (large). Across both pre-recorded and real-time transcription, it delivers an average 30% reduction in WER relative to competitors, and its pre-recorded inference is 5 to 40 times faster. Nova-2 is priced at $0.0043/min for pre-recorded audio, which makes it more affordable than other full-functionality providers. The model was trained on a diverse dataset and improves on Nova-1 in entity accuracy, punctuation accuracy, and capitalization error rate. Deepgram's benchmarking methodology uses more than 50 hours of human-annotated audio across various domains and compares Nova-2 with other prominent models on the market.
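To make the WER figures above concrete, here is a minimal sketch of how word error rate and a relative WER improvement are commonly computed. It uses the standard word-level edit distance definition of WER; it is not Deepgram's benchmarking code. The function names and the sample sentences and WER values are hypothetical and chosen only for illustration. The only figure taken from the text is the $0.0043/min price.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # Word-level Levenshtein distance, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Fraction by which new_wer is lower than baseline_wer."""
    return (baseline_wer - new_wer) / baseline_wer


def transcription_cost(minutes: float, price_per_minute: float = 0.0043) -> float:
    """Cost in dollars at the pre-recorded price quoted in the text."""
    return minutes * price_per_minute


if __name__ == "__main__":
    # Hypothetical transcripts: one substituted word out of six.
    wer = word_error_rate("the cat sat on the mat", "the cat sat on a mat")
    print(round(wer, 3))

    # Hypothetical WER values, not benchmark results.
    print(round(relative_wer_reduction(0.10, 0.064), 2))

    # 1,000 minutes of pre-recorded audio at $0.0043/min.
    print(round(transcription_cost(1000), 2))
```

A relative improvement compares two error rates as a ratio, so a drop from a 10% WER to a 6.4% WER is a 36% relative improvement even though the absolute difference is only 3.6 percentage points.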
Company
Deepgram
Date published
Sept. 19, 2023
Author(s)
Josh Fox
Word count
2281
Language
English
Hacker News points
2