Discrete Optimal Transport and Voice Conversion

Published: May 7, 2025 | arXiv ID: 2505.04382v2

By: Anton Selitskiy, Maitreya Kocharekar

Potential Business Impact:

Changes one person's voice to sound like another.

Business Areas:
Speech Recognition, Data and Analytics, Software

In this work, we address the voice conversion (VC) task using a vector-based interface. To align audio embeddings between speakers, we employ discrete optimal transport mapping. Our evaluation results demonstrate the high quality and effectiveness of this method. Additionally, we show that applying discrete optimal transport as a post-processing step in audio generation can lead to the incorrect classification of synthetic audio as real.

Page Count
4 pages

Category
Electrical Engineering and Systems Science:
Audio and Speech Processing