FakeSound: Deepfake General Audio Detection

Zeyu Xie, Baihan Li, Xuenan Xu, Zheng Liang, Kai Yu, and Mengyue Wu

arXiv:2406.08052 · cs.SD · June 13, 2024

TL;DR

This paper introduces FakeSound, a dataset for deepfake general audio detection, together with a benchmark detection model. Human listeners struggle to identify the manipulated audio, and the model outperforms both them and existing methods.

Contribution

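The paper proposes a new task, deepfake general audio detection, and presents FakeSound, a dataset built with an automated manipulation pipeline. It also provides a benchmark model, based on a general audio pre-trained model, that outperforms both current state-of-the-art techniques and human listeners.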

Findings

01

Human listeners average below 0.6 binary accuracy on every test set when asked to detect deepfake audio.

02

The proposed model outperforms state-of-the-art methods from deepfake speech detection as well as human listeners.

03

The FakeSound dataset effectively challenges both humans and models in deepfake detection.

Abstract

With the advancement of audio generation, generative models can produce highly realistic audio. However, the proliferation of deepfake general audio can have negative consequences. Therefore, we propose a new task, deepfake general audio detection, which aims to identify whether audio content is manipulated and to locate deepfake regions. Leveraging an automated manipulation pipeline, a dataset named FakeSound for deepfake general audio detection is proposed, and samples can be viewed on the website https://FakeSoundData.github.io. The average binary accuracy of humans on all test sets is consistently below 0.6, which indicates the difficulty humans face in discerning deepfake audio and affirms the efficacy of the FakeSound dataset. A deepfake detection model utilizing a general audio pre-trained model is proposed as a benchmark system. Experimental results demonstrate that the performance…
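The two parts of the task described in the abstract (deciding whether a clip is manipulated, and locating the manipulated regions) can be illustrated with a minimal sketch. It assumes clip-level labels are encoded as 1 for deepfake and 0 for genuine, and that a detector emits one label per fixed-length frame. The function names, the label encoding, and the frame length are hypothetical choices for illustration and are not taken from the paper or its repository.

```python
from typing import List, Tuple


def binary_accuracy(y_true: List[int], y_pred: List[int]) -> float:
    """Fraction of clips whose predicted label (1 = deepfake, 0 = genuine) is correct."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(int(t == p) for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def frames_to_regions(frame_labels: List[int], frame_sec: float) -> List[Tuple[float, float]]:
    """Merge consecutive frames labeled 1 into (start, end) regions in seconds.

    frame_sec is an assumed, fixed frame duration; it is an illustrative
    parameter, not a value reported in the paper.
    """
    regions: List[Tuple[float, float]] = []
    start = None  # index of the first frame of the currently open region
    for i, label in enumerate(frame_labels):
        if label == 1 and start is None:
            start = i  # a deepfake region begins
        elif label != 1 and start is not None:
            regions.append((start * frame_sec, i * frame_sec))  # region ends
            start = None
    if start is not None:  # region runs to the end of the clip
        regions.append((start * frame_sec, len(frame_labels) * frame_sec))
    return regions
```

For example, `binary_accuracy([1, 0, 1, 1], [1, 0, 0, 1])` gives 0.75, and `frames_to_regions([0, 1, 1, 0, 0, 1], 0.5)` gives two regions, from 0.5 s to 1.5 s and from 2.5 s to 3.0 s. Under this encoding, a listener accuracy below 0.6 means fewer than six in ten clips were labeled correctly.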

Code & Models

Repositories

FakeSoundData/FakeSound
PyTorch · Official

Datasets

ZeyuXie/FakeSound
dataset · 13 downloads

Taxonomy

Topics: Digital Media Forensic Detection · Music and Audio Processing