Lower latency, lower cost, more possibilities

Company

AssemblyAI

Date Published

Jan. 10, 2024

Author

Ryan O'Connor

Word count

1008

Language

English

Hacker News points

URL

www.assemblyai.com/blog/lower-latency-new-pricing

Summary

AssemblyAI has introduced major improvements to its API's inference latency: the majority of audio files now complete in well under 45 seconds regardless of audio duration, with a Real-Time Factor (RTF) as low as 0.008x. These gains come without any compromise on accuracy, as evidenced by the Conformer-2 model's industry-leading average Word Error Rate (WER) of approximately 6%. AssemblyAI achieved this through intelligent mini-batching, hardware parallelization, and optimized serving infrastructure. These efficiency gains have enabled reduced pricing for both async ($0.37 per hour) and real-time ($0.47 per hour) speech-to-text models. The company also plans to release more updates over the next few months.
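
To make the latency and pricing figures concrete, here is a minimal sketch of the arithmetic behind them. It assumes the usual definition of Real-Time Factor (processing time divided by audio duration) and assumes billing scales linearly with audio duration. Neither assumption is confirmed by the summary. The function names `processing_seconds` and `cost_usd` are hypothetical helpers, not part of any AssemblyAI SDK.

```python
# Figures quoted in the summary.
BEST_CASE_RTF = 0.008          # "as low as" value, so a best case, not a guarantee
ASYNC_USD_PER_HOUR = 0.37      # async speech-to-text price
REALTIME_USD_PER_HOUR = 0.47   # real-time speech-to-text price

SECONDS_PER_HOUR = 3600


def processing_seconds(audio_seconds: float, rtf: float = BEST_CASE_RTF) -> float:
    """Estimate processing time, assuming RTF = processing time / audio duration."""
    return audio_seconds * rtf


def cost_usd(audio_seconds: float, usd_per_hour: float) -> float:
    """Estimate cost, assuming price scales linearly with audio duration."""
    return audio_seconds / SECONDS_PER_HOUR * usd_per_hour


if __name__ == "__main__":
    one_hour = 1 * SECONDS_PER_HOUR
    print(f"Best-case processing time: {processing_seconds(one_hour):.1f} s")
    print(f"Async cost:     ${cost_usd(one_hour, ASYNC_USD_PER_HOUR):.2f}")
    print(f"Real-time cost: ${cost_usd(one_hour, REALTIME_USD_PER_HOUR):.2f}")
```

Under these assumptions, one hour of audio at the best-case RTF would take about 28.8 seconds to process, which is consistent with the "well under 45 seconds" claim.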