Attack Agnostic Dataset: Towards Generalization and Stabilization of Audio DeepFake Detection
Piotr Kawa, Marcin Plata, Piotr Syga

TL;DR
This paper introduces the Attack Agnostic Dataset to improve generalization and robustness of audio DeepFake detection methods, proposing a new LCNN-based model that outperforms existing approaches in stability and accuracy.
Contribution
The work presents a novel dataset combining multiple DeepFake and anti-spoofing datasets for better generalization, and introduces an LCNN model with LFCC and mel-spectrogram features that enhances detection stability.
Findings
The proposed dataset improves detection generalization across unseen attacks.
The LCNN model with LFCC and mel-spectrogram features reduces variability and error rates.
The model shows up to 5% improvement in EER over baseline methods.
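The findings are reported in terms of equal error rate (EER), the operating point at which the false acceptance rate equals the false rejection rate. The sketch below shows one simple way to approximate it from detector scores. It assumes that higher scores mean "more likely bona fide", and the function name `compute_eer` is illustrative, not taken from the paper.

```python
def compute_eer(bona_fide_scores, spoof_scores):
    """Approximate the equal error rate from two lists of detector scores.

    Assumption: a higher score means the detector considers the sample
    more likely to be bona fide (genuine) speech.
    """
    # Every observed score is a candidate decision threshold.
    thresholds = sorted(set(bona_fide_scores) | set(spoof_scores))
    best_gap, eer = float("inf"), 1.0
    for t in thresholds:
        # False acceptance: spoofed samples scored at or above the threshold.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        # False rejection: bona fide samples scored below the threshold.
        frr = sum(s < t for s in bona_fide_scores) / len(bona_fide_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the threshold where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Perfectly separated scores give an EER of 0.
print(compute_eer([0.9, 0.8], [0.1, 0.2]))
```

With a finite set of scores the two error rates rarely cross exactly, so this version reports the midpoint at the closest threshold. Evaluation toolkits often interpolate instead.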
Abstract
Audio DeepFakes allow the creation of high-quality, convincing utterances and therefore pose a threat due to their potential applications, such as impersonation or fake news. Methods for detecting these manipulations should be characterized by good generalization and stability, leading to robustness against attacks conducted with techniques that are not explicitly included in the training. In this work, we introduce Attack Agnostic Dataset, a combination of two audio DeepFake datasets and one anti-spoofing dataset that, thanks to the disjoint use of attacks, can lead to better generalization of detection methods. We present a thorough analysis of current DeepFake detection methods and consider different audio features (front-ends). In addition, we propose a model based on LCNN with LFCC and mel-spectrogram front-end, which not only is characterized by a good generalization and stability results…
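The "disjoint use of attacks" mentioned in the abstract can be illustrated with a small sketch: attack types are partitioned into folds, and the attacks held out for testing never appear in training. This is only an assumed reading of the idea. The attack identifiers, the fold count, and the helper names `make_attack_folds` and `split_attacks` are hypothetical and do not reflect the paper's actual protocol.

```python
def make_attack_folds(attack_ids, n_folds):
    """Partition attack identifiers into n_folds disjoint groups (round-robin)."""
    return [attack_ids[i::n_folds] for i in range(n_folds)]


def split_attacks(folds, test_fold):
    """Return (train_attacks, test_attacks) with no attack shared between them."""
    test_attacks = list(folds[test_fold])
    train_attacks = [a for i, fold in enumerate(folds) if i != test_fold for a in fold]
    return train_attacks, test_attacks


# Hypothetical attack identifiers, used only for illustration.
attacks = ["A1", "A2", "A3", "A4", "A5", "A6"]
folds = make_attack_folds(attacks, n_folds=3)
train_attacks, test_attacks = split_attacks(folds, test_fold=0)

# The detector is evaluated only on attacks it never saw during training.
print(train_attacks, test_attacks)
```

Rotating `test_fold` over all folds lets each attack serve as an unseen attack once, which is one way to measure how stable a detector is across attack types.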
Taxonomy
Topics: Digital Media Forensic Detection · Music and Audio Processing · Speech Recognition and Synthesis
