Child-to-Adult Voice Style Transfer: A Case Study in Auditory AI

What's this blog post about?

This case study in auditory AI describes an independent project at Stanford that explored child-to-adult voice style transfer using state-of-the-art models. The project found that even the best voice style transfer pipelines struggled with child inputs, despite producing impressive results in adult-to-adult conversion. Three models were tested: a classic voice cloning architecture, a few-shot AutoVC architecture, and a traditional many-to-many voice conversion model. None of them produced satisfactory results for child-to-adult voice style transfer. The study highlights how complex audio processing and machine learning research remain in this area.

Company
Deepgram

Date published
Jan. 24, 2024

Author(s)
Jose Nicholas Francisco

Word count
1736

Language
English

Hacker News points
None found.