Voice Cloning: Comprehensive Survey
By: Hussam Azzuni, Abdulmotaleb El Saddik
Potential Business Impact:
Lets computers imitate anyone's voice from only a few spoken words.
Voice cloning has advanced rapidly in recent years, with many researchers and companies working to improve its algorithms for a range of applications. This article aims to establish a standardized terminology for voice cloning and to explore its variations. It covers speaker adaptation as the fundamental concept, then examines few-shot, zero-shot, and multilingual text-to-speech (TTS) within that context. Finally, it reviews the evaluation metrics commonly used in voice cloning research and the related datasets. By compiling the available voice cloning algorithms, this survey aims to encourage research on both generating and detecting cloned voices, so that misuse can be limited.