Generalized Fake Audio Detection via Deep Stable Learning

Zhiyong Wang; Ruibo Fu; Zhengqi Wen; Yuankun Xie; Yukun Liu; Xiaopeng Wang; Xuefei Liu; Yongwei Li; Jianhua Tao; Yi Lu; Xin Qi; Shuchen Shi

arXiv:2406.03237·cs.SD·June 6, 2024·1 cites

TL;DR

This paper introduces a stable learning method with a Sample Weight Learning module to improve fake audio detection models' ability to generalize across datasets with different distributions, without requiring extra data.

Contribution

The paper proposes a portable, plug-in SWL module that learns sample weights to decorrelate features, improving generalization across datasets without requiring extra training data.

Findings

01 SWL improves model generalization across datasets with different distributions.

02 The module is easy to apply to multiple existing base models.

03 Experiments on the ASVspoof datasets demonstrate the effectiveness of SWL.

Abstract

Although current fake audio detection approaches have achieved remarkable success on specific datasets, they often fail when evaluated with datasets from different distributions. Previous studies typically address distribution shift by focusing on using extra data or applying extra loss restrictions during training. However, these methods either require a substantial amount of data or complicate the training process. In this work, we propose a stable learning-based training scheme that involves a Sample Weight Learning (SWL) module, addressing distribution shift by decorrelating all selected features via learning weights from training samples. The proposed portable plug-in-like SWL is easy to apply to multiple base models and generalizes them without using extra data during training. Experiments conducted on the ASVspoof datasets clearly demonstrate the effectiveness of SWL in…
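
The core idea in the abstract, learning per-sample weights so that features become decorrelated, can be illustrated with a small sketch. This is not the paper's SWL module: it assumes plain linear decorrelation (squared off-diagonal weighted covariance) on synthetic features, and every name in it (`weighted_cov`, `decorrelation_loss`, `learn_sample_weights`, the step size and step count) is hypothetical and chosen only for illustration.

```python
import numpy as np


def weighted_cov(features, weights):
    """Covariance of the feature columns under normalized sample weights."""
    w = weights / weights.sum()
    centered = features - w @ features
    return (centered * w[:, None]).T @ centered


def decorrelation_loss(features, weights):
    """Sum of squared off-diagonal entries of the weighted covariance."""
    cov = weighted_cov(features, weights)
    off_diag = cov - np.diag(np.diag(cov))
    return float((off_diag ** 2).sum())


def learn_sample_weights(features, steps=300, lr=0.02):
    """Gradient descent on per-sample logits (illustrative assumption).

    Weights are n * softmax(logits), so they stay positive with mean 1.
    """
    n = features.shape[0]
    logits = np.zeros(n)
    for _ in range(steps):
        p = np.exp(logits - logits.max())
        p /= p.sum()
        centered = features - p @ features
        cov = (centered * p[:, None]).T @ centered
        off_diag = cov - np.diag(np.diag(cov))
        # Gradient of the loss with respect to each sample's probability.
        grad_p = 2.0 * np.einsum("ia,ab,ib->i", centered, off_diag, centered)
        # Chain rule through the softmax parameterization.
        grad_logits = p * (grad_p - p @ grad_p)
        logits -= lr * n * grad_logits
    p = np.exp(logits - logits.max())
    p /= p.sum()
    return n * p


# Synthetic features: columns 0 and 1 are correlated, column 2 is independent.
rng = np.random.default_rng(0)
n = 256
x0 = rng.normal(size=n)
x1 = 0.8 * x0 + 0.6 * rng.normal(size=n)
x2 = rng.normal(size=n)
features = np.stack([x0, x1, x2], axis=1)

weights = learn_sample_weights(features)
loss_before = decorrelation_loss(features, np.ones(n))
loss_after = decorrelation_loss(features, weights)
print(f"decorrelation loss: uniform={loss_before:.4f}, learned={loss_after:.4f}")
```

In a training loop, weights learned this way would typically rescale each sample's classification loss, so the base model sees a reweighted batch in which the selected features are less correlated. How the paper selects features and measures dependence is not stated in the excerpt above, so those parts are left as assumptions here.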

Code & Models

Repositories

john852517791/pytorch_lightning_FAD
PyTorch · Official

Taxonomy

Topics: Digital Media Forensic Detection · Music and Audio Processing · Generative Adversarial Networks and Image Synthesis