Continual Learning for Fake Audio Detection
Haoxin Ma, Jiangyan Yi, Jianhua Tao, Ye Bai, Zhengkun Tian, Chenglong Wang

TL;DR
This paper introduces a continual learning approach for fake audio detection that effectively learns new spoofing attacks without forgetting previous knowledge, outperforming traditional fine-tuning methods.
Contribution
The paper proposes a continual-learning-based method that combines a knowledge distillation loss with an embedding similarity loss, so that a fake audio detector can learn unseen spoofing attacks without losing accuracy on earlier ones.
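As a rough illustration of how such a combined objective could be assembled, the sketch below adds a distillation term and an embedding similarity term to the usual classification loss. It is not the authors' implementation. The function names, the weights `alpha` and `beta`, the temperature, the use of cosine distance, and the choice of label 0 for genuine audio are all assumptions made for this example.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    """Standard classification loss on the new spoofing data."""
    return -math.log(softmax(logits)[label])

def distillation_loss(old_logits, new_logits, temperature=2.0):
    """Soft cross-entropy between the frozen original model's outputs
    and the updated model's outputs (temperature is an assumed value)."""
    p_old = softmax(old_logits, temperature)
    p_new = softmax(new_logits, temperature)
    return -sum(po * math.log(pn) for po, pn in zip(p_old, p_new))

def cosine_distance(u, v):
    """1 - cosine similarity; the small epsilon guards against zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v + 1e-12)

def continual_loss(new_logits, old_logits, new_emb, old_emb, label,
                   alpha=0.5, beta=0.5, genuine_label=0):
    """Hypothetical combined loss for one training sample.

    alpha and beta are illustrative weights, not values from the paper.
    The embedding term is applied only to genuine samples, following the
    assumption that genuine speech is distributed consistently across
    scenarios.
    """
    loss = cross_entropy(new_logits, label)
    loss += alpha * distillation_loss(old_logits, new_logits)
    if label == genuine_label:
        loss += beta * cosine_distance(old_emb, new_emb)
    return loss

# Example: one genuine sample and one spoofed sample.
genuine = continual_loss([2.0, -1.0], [1.8, -0.9], [0.6, 0.8], [0.8, 0.6], label=0)
spoofed = continual_loss([-1.0, 2.0], [-0.5, 1.0], [0.1, 0.9], [0.9, 0.1], label=1)
print(genuine, spoofed)
```

In this sketch the original model is treated as frozen: its logits and embeddings act as fixed targets, and only the updated model would receive gradients in a real training loop.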
Findings
Outperforms fine-tuning, with a relative reduction in average equal error rate (EER) of up to 81.62%
Effective in incremental learning of new spoofing attacks
Maintains performance on previous data without retraining
Abstract
Fake audio attacks have become a major threat to speaker verification systems. Although current detection approaches achieve promising results in dataset-specific scenarios, they struggle on unseen spoofing data. Fine-tuning and retraining from scratch have both been used to incorporate new data. However, fine-tuning degrades performance on previous data, and retraining requires substantial time and computational resources. In addition, previous data are sometimes unavailable for privacy reasons. To solve these problems, this paper proposes detecting fake without forgetting, a continual-learning-based method that lets the model learn new spoofing attacks incrementally. A knowledge distillation loss is added to the loss function to preserve the memory of the original model. Supposing the distribution of genuine voice is consistent among different scenarios, an extra…
Taxonomy
Methods: Knowledge Distillation
