The Codecfake Dataset and Countermeasures for the Universally Detection of Deepfake Audio
Yuankun Xie, Yi Lu, Ruibo Fu, Zhengqi Wen, Zhiyong Wang, Jianhua Tao, Xin Qi, Xiaopeng Wang, Yukun Liu, Haonan Cheng, Long Ye, Yi Sun

TL;DR
This paper introduces the Codecfake dataset and a novel countermeasure, CSAM, to improve the universal detection of ALM-based deepfake audio, demonstrating a significant reduction in error rates across diverse test conditions.
Contribution
The paper presents the large-scale Codecfake dataset and a domain-balanced CSAM strategy to enhance generalization in deepfake audio detection.
Findings
Training on the Codecfake dataset enables effective detection of ALM-based deepfake audio.
CSAM achieves the lowest average equal error rate (EER) of 0.616% across all test conditions.
Dataset and code are publicly available.
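For readers unfamiliar with the EER metric quoted in the findings, the following is a minimal sketch of how an equal error rate can be computed from detector scores. It is not the paper's evaluation code: the function name `compute_eer`, the score convention (higher means more likely bona fide), and the toy scores are all assumptions made for illustration.

```python
def compute_eer(bonafide_scores, spoof_scores):
    """Return (eer, threshold), assuming higher scores mean 'more likely bona fide'."""
    # Candidate thresholds: every score observed in either class.
    thresholds = sorted(set(bonafide_scores) | set(spoof_scores))
    best = None
    for t in thresholds:
        # False rejection rate: bona fide trials scored below the threshold.
        frr = sum(s < t for s in bonafide_scores) / len(bonafide_scores)
        # False acceptance rate: spoof trials scored at or above the threshold.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        # Keep the threshold where the two error rates are closest.
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Toy scores (hypothetical, not from the paper).
bonafide = [0.9, 0.8, 0.7, 0.3]
spoof = [0.1, 0.2, 0.4, 0.6]
eer, threshold = compute_eer(bonafide, spoof)
print(f"EER = {eer:.3f} at threshold {threshold}")
```

The EER is the operating point where false acceptances and false rejections are equally frequent, so a lower value means the detector separates bona fide and spoofed audio more cleanly.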
Abstract
With the proliferation of Audio Language Model (ALM) based deepfake audio, there is an urgent need for generalized detection methods. ALM-based deepfake audio is currently widespread, highly deceptive, and versatile in type, posing a significant challenge to current audio deepfake detection (ADD) models trained solely on vocoded data. To effectively detect ALM-based deepfake audio, we focus on the mechanism of the ALM-based audio generation method, the conversion from neural codec to waveform. We initially constructed the Codecfake dataset, an open-source, large-scale collection comprising over 1 million audio samples in both English and Chinese, focused on ALM-based audio detection. As a countermeasure, to achieve universal detection of deepfake audio and tackle the domain ascent bias issue of the original sharpness-aware minimization (SAM), we propose the CSAM strategy to learn a domain…
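As background for the SAM reference in the abstract, here is a minimal sketch of one update step of the original sharpness-aware minimization procedure, on a toy quadratic loss. It does not implement CSAM, whose description is truncated above. The names `sam_step` and `toy_grad`, the values of `lr` and `rho`, and the toy loss are hypothetical choices for illustration.

```python
import math


def sam_step(w, grad_fn, lr=0.1, rho=0.05):
    """One step of original SAM on a list of floats `w` (illustrative only)."""
    g = grad_fn(w)
    norm = math.sqrt(sum(gi * gi for gi in g)) + 1e-12  # avoid division by zero
    # Ascent: move to the approximate worst-case point in an L2 ball of radius rho.
    w_adv = [wi + rho * gi / norm for wi, gi in zip(w, g)]
    # Descent: apply the gradient taken at the perturbed point to the original weights.
    g_adv = grad_fn(w_adv)
    return [wi - lr * gi for wi, gi in zip(w, g_adv)]


def toy_grad(w):
    # Gradient of the hypothetical loss L(w) = sum(w_i ** 2).
    return [2.0 * wi for wi in w]


w = [1.0, -2.0]
for _ in range(50):
    w = sam_step(w, toy_grad)
final_norm = math.sqrt(sum(wi * wi for wi in w))
print(f"weight norm after 50 SAM steps: {final_norm:.4f}")
```

The ascent step is computed from a single gradient of the training loss. When training data mixes several domains, that one gradient decides the perturbation direction for all of them, which is the setting in which the abstract raises the domain ascent bias issue.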
Taxonomy
Topics: Digital Media Forensic Detection
Methods: Attentive Walk-Aggregating Graph Neural Network · Focus · Sharpness-Aware Minimization
