Fake Speech Wild: Detecting Deepfake Speech on Social Media Platform
By: Yuankun Xie, Ruibo Fu, Xiaopeng Wang and more
Potential Business Impact:
Catches fake voices made by computers.
The rapid advancement of speech generation technology has led to the widespread proliferation of deepfake speech across social media platforms. While deepfake audio countermeasures (CMs) achieve promising results on public datasets, their performance degrades significantly in cross-domain scenarios. To advance CMs for real-world deepfake detection, we first propose the Fake Speech Wild (FSW) dataset, which includes 254 hours of real and deepfake audio from four different media platforms, focusing on social media. We then establish a benchmark using public datasets and advanced self-supervised learning (SSL)-based CMs to evaluate current CMs in real-world scenarios. We also assess the effectiveness of data augmentation strategies in enhancing CM robustness for detecting deepfake speech on social media. Finally, by augmenting public datasets and incorporating the FSW training set, we significantly advance real-world deepfake audio detection performance, achieving an average equal error rate (EER) of 3.54% across all evaluation sets.
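The headline metric above, the equal error rate, is the operating point at which the false acceptance rate (deepfake audio accepted as real) equals the false rejection rate (real audio rejected as fake). The sketch below is a minimal illustration of that definition, not the authors' evaluation code. It assumes that higher scores mean "more likely real", and the function name and toy scores are hypothetical.

```python
def equal_error_rate(bonafide_scores, spoof_scores):
    """Return (eer, threshold), assuming higher scores mean 'more likely real'."""
    if not bonafide_scores or not spoof_scores:
        raise ValueError("both score lists must be non-empty")

    # Candidate thresholds: every observed score, plus one above the maximum
    # so that the "reject everything" operating point is also considered.
    candidates = sorted(set(bonafide_scores) | set(spoof_scores))
    candidates.append(candidates[-1] + 1.0)

    best = None
    for t in candidates:
        # False acceptance: spoofed audio scoring at or above the threshold.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        # False rejection: real audio scoring below the threshold.
        frr = sum(s < t for s in bonafide_scores) / len(bonafide_scores)
        gap = abs(far - frr)
        # Keep the threshold where the two error rates are closest.
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Toy scores (made up for illustration): one real clip scores low,
# one deepfake clip scores high.
eer, threshold = equal_error_rate([0.9, 0.8, 0.7, 0.35], [0.1, 0.2, 0.3, 0.75])
print(f"EER = {eer:.2%} at threshold {threshold}")  # EER = 25.00% at threshold 0.7
```

An average EER across several evaluation sets, as reported in the abstract, would then be the mean of the per-set values computed this way.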
Similar Papers
SocialDF: Benchmark Dataset and Detection Model for Mitigating Harmful Deepfake Content on Social Media Platforms
Machine Learning (CS)
Finds fake videos and voices online.
EchoFake: A Replay-Aware Dataset for Practical Speech Deepfake Detection
Audio and Speech Processing
Stops fake voices from tricking people over the phone.
Multilingual Dataset Integration Strategies for Robust Audio Deepfake Detection: A SAFE Challenge System
Audio and Speech Processing
Finds fake voices in recordings.