Child-to-Adult Voice Style Transfer: A Case Study in Auditory AI
An independent project at Stanford explored child-to-adult voice style transfer using state-of-the-art models. It found that even the best voice style transfer pipelines struggled with child inputs, despite impressive results in adult-to-adult conversion. The project tested three models: a classic voice cloning architecture, a few-shot AutoVC architecture, and a traditional many-to-many voice conversion model. None of them produced satisfactory results for child-to-adult voice style transfer. The study highlights the complexities of audio processing and machine learning research on child speech.
Company
Deepgram
Date published
Jan. 24, 2024
Author(s)
Jose Nicholas Francisco
Word count
1736
Hacker News points
None found.
Language
English