Discrete Optimal Transport and Voice Conversion
By: Anton Selitskiy, Maitreya Kocharekar
Potential Business Impact:
Changes one person's voice to sound like another person's.
In this work, we address the voice conversion (VC) task using a vector-based interface. To align audio embeddings between speakers, we employ discrete optimal transport mapping. Our evaluation results demonstrate the high quality and effectiveness of this method. Additionally, we show that applying discrete optimal transport as a post-processing step in audio generation can lead to the incorrect classification of synthetic audio as real.
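To make the idea of aligning speaker embeddings with a discrete optimal transport map concrete, here is a minimal sketch rather than the authors' implementation. It assumes an entropic (Sinkhorn) solver, a squared Euclidean cost, uniform weights on both point clouds, and a barycentric projection to turn the transport plan into a mapping; the abstract specifies none of these. The function names `sinkhorn_plan` and `barycentric_map` are hypothetical, and random arrays stand in for real audio embeddings.

```python
import numpy as np


def sinkhorn_plan(cost, reg=0.05, n_iters=500):
    """Entropic OT plan between two uniform discrete measures (assumed setup)."""
    n, m = cost.shape
    a = np.full(n, 1.0 / n)          # uniform source weights (assumption)
    b = np.full(m, 1.0 / m)          # uniform target weights (assumption)
    K = np.exp(-cost / reg)          # Gibbs kernel
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        u = a / (K @ v)              # rescale rows toward marginal a
        v = b / (K.T @ u)            # rescale columns toward marginal b
    return u[:, None] * K * v[None, :]


def barycentric_map(src, tgt, reg=0.05, n_iters=500):
    """Map each source vector to a plan-weighted average of target vectors."""
    # Pairwise squared Euclidean cost, scaled to [0, 1] for numerical stability.
    cost = ((src[:, None, :] - tgt[None, :, :]) ** 2).sum(axis=-1)
    cost = cost / cost.max()
    plan = sinkhorn_plan(cost, reg=reg, n_iters=n_iters)
    # Normalise each row so the weights over target vectors sum to one.
    return (plan @ tgt) / plan.sum(axis=1, keepdims=True)


# Placeholder data: 40 source-speaker and 60 target-speaker embeddings, dim 16.
rng = np.random.default_rng(0)
src = rng.normal(size=(40, 16))
tgt = rng.normal(loc=1.0, size=(60, 16))
converted = barycentric_map(src, tgt)
print(converted.shape)
```

Each converted vector is a convex combination of target-speaker vectors, so it stays inside the region the target embeddings occupy. A smaller `reg` brings the plan closer to the unregularised discrete transport plan, at the cost of slower convergence and a higher risk of numerical underflow.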
Similar Papers
Training-Free Voice Conversion with Factorized Optimal Transport
Sound
Changes voices to sound like someone else.
O_O-VC: Synthetic Data-Driven One-to-One Alignment for Any-to-Any Voice Conversion
Sound
Changes your voice to sound like anyone.
Discrete optimal transport is a strong audio adversarial attack
Audio and Speech Processing
Makes fake voices sound real to computers.