FakeSound2: A Benchmark for Explainable and Generalizable Deepfake Sound Detection

Published: September 21, 2025 | arXiv ID: 2509.17162v1

By: Zeyu Xie, Yaoyun Zhang, Xuenan Xu and more

Potential Business Impact:

Detects forged or manipulated sounds in audio recordings.

Business Areas:
Speech Recognition, Data and Analytics, Software

The rapid development of generative audio raises ethical and security concerns stemming from forged data, making deepfake sound detection an important safeguard against the malicious use of such technologies. Although prior studies have explored this task, existing methods largely focus on binary classification and fall short in explaining how manipulations occur, tracing where the sources originated, or generalizing to unseen sources, thereby limiting the explainability and reliability of detection. To address these limitations, we present FakeSound2, a benchmark designed to advance deepfake sound detection beyond binary accuracy. FakeSound2 evaluates models across three dimensions: localization, traceability, and generalization, covering 6 manipulation types and 12 diverse sources. Experimental results show that although current systems achieve high classification accuracy, they struggle to recognize forged pattern distributions and provide reliable explanations. By highlighting these gaps, FakeSound2 establishes a comprehensive benchmark that reveals key challenges and aims to foster robust, explainable, and generalizable approaches for trustworthy audio authentication.

Page Count
5 pages

Category
Computer Science:
Sound