FakeSound: Deepfake General Audio Detection
Zeyu Xie, Baihan Li, Xuenan Xu, Zheng Liang, Kai Yu, and Mengyue Wu

TL;DR
This paper introduces FakeSound, a dataset for deepfake general audio detection, together with a benchmark detection model built on a general audio pre-trained model. Human listeners struggle to identify the manipulated audio, and the paper reports that the model outperforms existing methods.
Contribution
The paper presents FakeSound, a comprehensive dataset for deepfake audio detection, and a benchmark model that outperforms current state-of-the-art techniques and human accuracy.
Findings
Human listeners average below 0.6 binary accuracy on every test set when asked to detect deepfake audio.
The proposed model surpasses existing methods in deepfake speech detection.
FakeSound dataset effectively challenges both humans and models in deepfake detection.
Abstract
With the advancement of audio generation, generative models can produce highly realistic audio. However, the proliferation of deepfake general audio can have negative consequences. Therefore, we propose a new task, deepfake general audio detection, which aims to identify whether audio content is manipulated and to locate deepfake regions. Leveraging an automated manipulation pipeline, a dataset named FakeSound for deepfake general audio detection is proposed, and samples can be viewed on the website https://FakeSoundData.github.io. The average binary accuracy of humans on all test sets is consistently below 0.6, which indicates the difficulty humans face in discerning deepfake audio and affirms the efficacy of the FakeSound dataset. A deepfake detection model utilizing a general audio pre-trained model is proposed as a benchmark system. Experimental results demonstrate that the performance…
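The task described in the abstract has two outputs: a clip-level decision (is the audio manipulated?) and a localization of the deepfake regions. The sketch below illustrates one simple way those two outputs could be derived from per-frame scores. It is not the paper's method: the function names, the 0.5 threshold, and the 20 ms frame length are all assumptions made for illustration, and the scores are assumed to come from some detector that emits one fake-probability per frame.

```python
from typing import List, Sequence, Tuple


def locate_fake_regions(
    frame_scores: Sequence[float],
    threshold: float = 0.5,   # assumed decision threshold, not from the paper
    frame_sec: float = 0.02,  # assumed frame length in seconds, not from the paper
) -> List[Tuple[float, float]]:
    """Merge consecutive frames scored >= threshold into (start, end) regions in seconds."""
    regions: List[Tuple[float, float]] = []
    start = None  # index of the first frame in the currently open region
    for i, score in enumerate(frame_scores):
        is_fake = score >= threshold
        if is_fake and start is None:
            start = i  # a new fake region opens here
        elif not is_fake and start is not None:
            regions.append((start * frame_sec, i * frame_sec))
            start = None  # region closed
    if start is not None:
        # The clip ended while a region was still open.
        regions.append((start * frame_sec, len(frame_scores) * frame_sec))
    return regions


def is_manipulated(frame_scores: Sequence[float], threshold: float = 0.5) -> bool:
    """Clip-level decision: flag the clip if any frame is scored as fake."""
    return any(score >= threshold for score in frame_scores)
```

Flagging a clip when any single frame crosses the threshold is the simplest possible aggregation rule; a practical system would likely smooth the frame scores or require a minimum region length before reporting a detection.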
Taxonomy
Topics: Digital Media Forensic Detection · Music and Audio Processing
