Trained on 100,000+ Voices: Deepgram Unveils Next-Gen Speaker Diarization and Language Detection Models

Company

Deepgram

Date Published

May 11, 2023

Author

Josh Fox

Word count

2158

Language

English

Hacker News points

None

URL

deepgram.com/learn/nextgen-speaker-diarization-and-language-detection-models

Summary

Deepgram has released a new speaker diarization model that offers best-in-class accuracy and processes audio 10 times faster than its nearest competitor. The model is language-agnostic and is free with all of the company's automatic speech recognition (ASR) models, including Nova and Whisper. Deepgram has also revamped its automatic language detection feature, cutting the error rate by up to 54.7% in relative terms on high-demand languages such as English, Spanish, Hindi, and German. The company's large-scale multilingual training approach lets it use fast, lean networks while still achieving world-class accuracy. On domain-specific, real-world data, Deepgram's diarization feature outperforms many commercial diarization models and common open-source alternatives like Pyannote.
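
The 54.7% figure for language detection is a relative improvement in error rate, not an absolute one. As a minimal sketch of the arithmetic, the helper below (a hypothetical function, not part of any Deepgram tooling) computes a relative reduction from two error rates. The input values are made up for illustration and are not taken from Deepgram's benchmarks.

```python
def relative_error_reduction(baseline_error: float, new_error: float) -> float:
    """Return the fraction of the baseline error rate removed by the new model."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return (baseline_error - new_error) / baseline_error


# Hypothetical error rates, chosen only to illustrate the arithmetic.
baseline = 0.200  # old model misidentifies the language 20% of the time
improved = 0.150  # new model misidentifies it 15% of the time

reduction = relative_error_reduction(baseline, improved)
print(f"Relative error rate improvement: {reduction:.1%}")
```

With these illustrative numbers, a 5-point absolute drop (20% to 15%) corresponds to a 25% relative improvement, which is why relative figures can look much larger than the absolute change behind them.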