Continuous Learning of Transformer-based Audio Deepfake Detection

Tuan Duy Nguyen Le; Kah Kuan Teh; Huy Dat Tran

arXiv:2409.05924·cs.SD·September 11, 2024·2 cites

Open Access

TL;DR

This paper introduces a new audio deepfake detection framework using Audio Spectrogram Transformer, emphasizing high accuracy and effective continuous learning with minimal labeled data, demonstrated through extensive dataset collection and augmentation.

Contribution

It presents a novel continuous learning plugin for audio deepfake detection that outperforms traditional fine-tuning with fewer labeled examples.

Findings

01. Achieved high accuracy on multiple benchmark datasets.

02. Developed a continuous learning plugin that requires minimal labeled data.

03. Enhanced detection robustness through diverse data augmentation.

Abstract

This paper proposes a novel framework for audio deepfake detection with two main objectives: i) attaining the highest possible accuracy on available fake data, and ii) effectively performing continuous learning on new fake data in a few-shot learning manner. Specifically, we conduct a large audio deepfake collection using various deep audio generation methods. The data is further enhanced with additional augmentation methods to increase variations amidst compressions, far-field recordings, noise, and other distortions. We then adopt the Audio Spectrogram Transformer for the audio deepfake detection model. Accordingly, the proposed method achieves promising performance on various benchmark datasets. Furthermore, we present a continuous learning plugin module to update the trained model most effectively with the fewest possible labeled data points of the new fake type. The proposed method…
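The abstract mentions augmenting the collected audio with noise and other distortions but does not say how. As a minimal sketch of one such step, assuming additive white Gaussian noise mixed in at a chosen signal-to-noise ratio, the idea can be illustrated with the standard library alone. The function names `add_noise_at_snr` and `measured_snr_db`, the 16 kHz sample rate, and the 10 dB target are illustrative assumptions, not details taken from the paper.

```python
import math
import random


def add_noise_at_snr(signal, snr_db, rng=None):
    """Return a copy of `signal` with white Gaussian noise added at `snr_db`.

    Hypothetical helper for illustration; the paper does not specify
    its augmentation procedure.
    """
    rng = rng or random.Random()
    signal_power = sum(x * x for x in signal) / len(signal)
    # SNR(dB) = 10 * log10(P_signal / P_noise)  =>  P_noise = P_signal / 10^(SNR/10)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = [rng.gauss(0.0, 1.0) for _ in signal]
    # Rescale the drawn noise so its empirical power equals the target power.
    raw_power = sum(n * n for n in noise) / len(noise)
    scale = math.sqrt(noise_power / raw_power)
    return [x + scale * n for x, n in zip(signal, noise)]


def measured_snr_db(clean, noisy):
    """Compute the SNR in dB of `noisy` relative to the `clean` reference."""
    signal_power = sum(x * x for x in clean) / len(clean)
    noise_power = sum((y - x) ** 2 for x, y in zip(clean, noisy)) / len(clean)
    return 10 * math.log10(signal_power / noise_power)


# Toy input: one second of a 440 Hz tone at an assumed 16 kHz sample rate.
sample_rate = 16000
tone = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(sample_rate)]
noisy = add_noise_at_snr(tone, snr_db=10.0, rng=random.Random(0))
```

In a training pipeline, a step like this would typically be applied with a randomly drawn SNR per example, alongside separate transforms for compression and far-field effects, which this sketch does not cover.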

Taxonomy

Topics: Digital Media Forensic Detection · Image and Signal Denoising Methods · Music and Audio Processing