Octave TTS Prompting Guide

Company

Hume

Date Published

Feb. 26, 2025

Author

Word count

1141

Language

English

Hacker News points

None

URL

www.hume.ai/blog/octave-tts-prompting-guide

Summary

Octave Text-to-Speech (TTS) is a state-of-the-art voice AI model that interprets the meaning of the text it reads and can be customized for any character, guiding emotional delivery and bringing stories to life with human-like expression. It is a speech-language model (speech LM) trained on data that captures the nuances of human vocal expression, so it can pick up on plot twists, emotional cues, and character traits in a script or prompt and turn them into lifelike speech. To get the best samples from the model, users should craft prompts with semantic alignment and character match in mind, experiment with punctuation, incorporate emotions, and develop detailed characters for their voices. Octave also supports multilingual prompting, currently in English and Spanish, with more languages planned in the coming weeks.