Fake Speech Wild: Detecting Deepfake Speech on Social Media Platform

Yuankun Xie; Ruibo Fu; Xiaopeng Wang; Zhiyong Wang; Ya Li; Zhengqi Wen; Haonnan Cheng; Long Ye

arXiv:2508.10559 · cs.SD · August 15, 2025

TL;DR

This paper introduces the Fake Speech Wild (FSW) dataset and benchmarks self-supervised learning (SSL)-based countermeasures for detecting deepfake speech on social media, reaching an average equal error rate (EER) of 3.54% in real-world scenarios.

Contribution

It presents the FSW dataset for real-world deepfake speech detection and evaluates SSL-based countermeasures with data augmentation to enhance robustness.

Findings

01 Achieved an average EER of 3.54% in real-world detection

02 Demonstrated the effectiveness of data augmentation strategies

03 Established a benchmark for social media deepfake speech detection
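
The headline metric above is the equal error rate (EER): the operating point at which the rate of accepted deepfake clips equals the rate of rejected real clips. The following is a minimal sketch of how an EER can be computed from detector scores, not the paper's evaluation code. The function name `compute_eer`, the toy scores, and the convention that a higher score means "more likely real" are all assumptions made for illustration.

```python
import numpy as np


def compute_eer(bonafide_scores, spoof_scores):
    """Return (eer, threshold) for a set of detector scores.

    Assumption: a higher score means the clip is more likely real (bona fide).
    """
    bonafide = np.asarray(bonafide_scores, dtype=float)
    spoof = np.asarray(spoof_scores, dtype=float)

    # Candidate thresholds: every distinct score observed, in ascending order.
    thresholds = np.unique(np.concatenate([bonafide, spoof]))

    # False rejection rate: real clips scoring below the threshold.
    frr = np.array([(bonafide < t).mean() for t in thresholds])
    # False acceptance rate: deepfake clips scoring at or above the threshold.
    far = np.array([(spoof >= t).mean() for t in thresholds])

    # The EER is taken where the two error rates are closest to each other.
    idx = int(np.argmin(np.abs(far - frr)))
    eer = (far[idx] + frr[idx]) / 2.0
    return float(eer), float(thresholds[idx])


# Toy scores (hypothetical, not from the FSW dataset).
real_scores = [0.9, 0.6, 0.4, 0.8]
fake_scores = [0.1, 0.5, 0.7, 0.2]
eer, threshold = compute_eer(real_scores, fake_scores)
print(f"EER = {eer:.2%} at threshold {threshold}")
```

With scores from several evaluation sets, an average EER would be obtained by running this per set and taking the mean of the results.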

Abstract

The rapid advancement of speech generation technology has led to the widespread proliferation of deepfake speech across social media platforms. While deepfake audio countermeasures (CMs) achieve promising results on public datasets, their performance degrades significantly in cross-domain scenarios. To advance CMs for real-world deepfake detection, we first propose the Fake Speech Wild (FSW) dataset, which includes 254 hours of real and deepfake audio from four different media platforms, focusing on social media. We then establish a benchmark using public datasets and advanced self-supervised learning (SSL)-based CMs to evaluate how current CMs perform in real-world scenarios. We also assess the effectiveness of data augmentation strategies in enhancing CM robustness for detecting deepfake speech on social media. Finally, by augmenting public datasets and incorporating the FSW training set, we…
