
Voice Cloning: Comprehensive Survey

Published: May 1, 2025 | arXiv ID: 2505.00579v1

By: Hussam Azzuni, Abdulmotaleb El Saddik

Potential Business Impact:

Makes computers sound like anyone with a few words.

Business Areas:

Speech Recognition, Data and Analytics, Software

Voice cloning has advanced rapidly, with many researchers and corporations working to improve its algorithms for various applications. This article aims to establish a standardized terminology for voice cloning and to explore its variations. It covers speaker adaptation as the fundamental concept and then examines few-shot, zero-shot, and multilingual text-to-speech (TTS) within that context. Finally, it reviews the evaluation metrics and datasets commonly used in voice cloning research. The survey compiles the available voice cloning algorithms to encourage research on both generating and detecting cloned speech, with the goal of limiting misuse.

DS-TTS: Zero-Shot Speaker Style Adaptation from Voice Clips via Dynamic Dual-Style Feature Modulation

Sound

Makes computers copy any voice from one sample.

1 Jun 2025 1

88%

ClonEval: An Open Voice Cloning Benchmark

Computation and Language

Tests how well computers copy voices.

29 Apr 2025 0

88%

Voice Cloning for Dysarthric Speech Synthesis: Addressing Data Scarcity in Speech-Language Pathology

Sound

Makes computers talk like people with speech problems.

3 Mar 2025 0


Page Count

26 pages

