Region-Based Optimization in Continual Learning for Audio Deepfake Detection
Yujie Chen, Jiangyan Yi, Cunhang Fan, Jianhua Tao, Yong Ren, Siding Zeng, Chu Yuan Zhang, Xinrui Yan, Hao Gu, Jun Xue, Chenglong Wang, Zhao Lv, Xiaohui Zhang

TL;DR
This paper introduces Region-Based Optimization (RegO), a continual learning method for audio deepfake detection that optimizes neuron regions adaptively according to their importance, reporting a 21.3% improvement in EER over RWM along with better generalization.
Contribution
The paper presents a novel region-adaptive optimization technique using Fisher information and Ebbinghaus forgetting to improve continual learning in audio deepfake detection.
Findings
Achieves 21.3% improvement in EER over RWM
Effective in reducing redundant neurons and promoting generalization
Potential applicability to other domains like image recognition
Abstract
Rapid advancements in speech synthesis and voice conversion bring convenience but also new security risks, creating an urgent need for effective audio deepfake detection. Although current models perform well, their effectiveness diminishes when confronted with the diverse and evolving nature of real-world deepfakes. To address this issue, we propose a continual learning method named Region-Based Optimization (RegO) for audio deepfake detection. Specifically, we use the Fisher information matrix to measure important neuron regions for real and fake audio detection, dividing them into four regions. First, we directly fine-tune the less important regions to quickly adapt to new tasks. Next, we apply gradient optimization in parallel for regions important only to real audio detection, and in orthogonal directions for regions important only to fake audio detection. For regions that are…
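The region split described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a diagonal empirical Fisher estimate (mean of squared per-sample gradients), a single hypothetical `threshold` for deciding importance, and a hypothetical reference direction `ref` (for example, a direction kept from earlier tasks) against which "parallel" and "orthogonal" are measured. The function names are illustrative. Because the abstract is truncated before it describes the region important to both classes, the sketch leaves that region's gradient untouched as a placeholder rather than guessing.

```python
def diag_fisher(per_sample_grads):
    """Diagonal empirical Fisher: mean of squared per-sample gradients."""
    n = len(per_sample_grads)
    dim = len(per_sample_grads[0])
    return [sum(g[i] ** 2 for g in per_sample_grads) / n for i in range(dim)]


def assign_regions(fisher_real, fisher_fake, threshold):
    """Label each parameter by which class it is important for.

    ASSUMPTION: importance is a simple threshold on the Fisher value.
    """
    regions = []
    for f_real, f_fake in zip(fisher_real, fisher_fake):
        real_imp, fake_imp = f_real > threshold, f_fake > threshold
        if real_imp and fake_imp:
            regions.append("both")
        elif real_imp:
            regions.append("real_only")
        elif fake_imp:
            regions.append("fake_only")
        else:
            regions.append("neither")
    return regions


def region_update(grad, ref, regions):
    """Reshape a gradient region by region.

    neither   -> plain fine-tuning, gradient kept as-is
    real_only -> keep only the component parallel to `ref`
    fake_only -> keep only the component orthogonal to `ref`
    both      -> placeholder: left unchanged (rule not given in the source)
    """
    out = list(grad)
    for name in ("real_only", "fake_only"):
        idx = [i for i, r in enumerate(regions) if r == name]
        norm2 = sum(ref[i] ** 2 for i in idx)
        if norm2 == 0.0:
            continue  # no usable reference direction in this region
        scale = sum(grad[i] * ref[i] for i in idx) / norm2
        for i in idx:
            parallel = scale * ref[i]
            out[i] = parallel if name == "real_only" else grad[i] - parallel
    return out


# Toy usage with six parameters (values are made up for illustration).
fisher_real = [0.9, 0.8, 0.0, 0.1, 0.7, 0.05]
fisher_fake = [0.0, 0.1, 0.9, 0.8, 0.6, 0.02]
regions = assign_regions(fisher_real, fisher_fake, threshold=0.5)
updated = region_update([1.0, 0.0, 1.0, 0.0, 0.3, 0.4], [1.0] * 6, regions)
```

In this toy run the two "fake_only" entries of `updated` have zero dot product with `ref`, which is what the orthogonal projection is meant to guarantee, while the "neither" entry passes through unchanged.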
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Digital Media Forensic Detection · Speech and Audio Processing
