Child-to-Adult Voice Style Transfer: A Case Study in Auditory AI

Company

Deepgram

Date Published

Jan. 24, 2024

Author

Jose Nicholas Francisco

Word count

1736

Language

English

Hacker News points

None

URL

deepgram.com/learn/child-to-adult-voice-style-transfer-auditory-ai-case-study

Summary

This case study in auditory AI describes an independent project at Stanford that explored child-to-adult voice style transfer using state-of-the-art models. The project found that even the best voice style transfer pipelines struggled with child inputs, despite producing impressive adult-to-adult conversions. Three models were tried: a classic voice cloning architecture, a few-shot AutoVC architecture, and a traditional many-to-many voice conversion model. None of them produced satisfactory child-to-adult results. The study highlights how complex audio processing and machine learning research are in this area.