Company
Date Published
Dec. 23, 2024
Author
Hume Research
Word count
984
Language
English
Hacker News points
None

Summary

OCTAVE (Omni-Capable Text and Voice Engine) is a next-generation speech-language model that combines the capabilities of various systems, including OpenAI's Voice Engine, ElevenLabs' TTS Voice Design, and Google DeepMind's NotebookLM. It can generate voices and personalities from prompts, clone voices from recordings in one step, interact with users in real time, and create multiple interacting characters. OCTAVE performs comparably to similarly sized LLMs on language understanding tasks, keeping its intelligence coherent and nuanced. Trusted partners are currently evaluating the model for safety and effectiveness ahead of broader availability in the coming months.