Listen, Analyze, and Adapt to Learn New Attacks: An Exemplar-Free Class Incremental Learning Method for Audio Deepfake Source Tracing
Yang Xiao, Rohan Kumar Das

TL;DR
This paper introduces AnaST, a novel exemplar-free class incremental learning method for audio deepfake source tracing that effectively adapts to new attacks while preventing catastrophic forgetting, with advantages in privacy and efficiency.
Contribution
AnaST is a new analytic class incremental learning approach for online audio deepfake source tracing: the feature extractor stays fixed, and only the classifier is updated, using a closed-form solution.
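A closed-form classifier update of this kind can be illustrated with recursive regularized least squares, a standard technique in analytic continual learning. The block below is a minimal sketch under assumptions, not the authors' implementation: the class name `AnalyticClassifier`, the ridge parameter `gamma`, and the use of precomputed feature vectors standing in for the frozen extractor's output are all hypothetical, and this summary does not state whether AnaST uses exactly this recursion.

```python
import numpy as np


class AnalyticClassifier:
    """Ridge-regression classifier updated recursively, phase by phase.

    Hypothetical illustration: `feats` are assumed to come from a frozen
    feature extractor, and `gamma` is an assumed ridge regularizer.
    """

    def __init__(self, feat_dim, gamma=1.0):
        # R tracks (X^T X + gamma * I)^-1 over all features seen so far.
        self.R = np.eye(feat_dim) / gamma
        # One weight column per known class; starts with no classes.
        self.W = np.zeros((feat_dim, 0))

    def fit_phase(self, feats, labels):
        """Absorb one batch of (possibly new-class) data in a single pass."""
        num_classes = max(self.W.shape[1], int(labels.max()) + 1)
        pad = num_classes - self.W.shape[1]
        if pad > 0:
            # New classes start with zero weights, since no earlier
            # sample carried these labels.
            self.W = np.hstack([self.W, np.zeros((self.W.shape[0], pad))])
        Y = np.eye(num_classes)[labels]  # one-hot targets

        # Woodbury identity: update the inverse without revisiting old data.
        G = np.eye(feats.shape[0]) + feats @ self.R @ feats.T
        self.R = self.R - self.R @ feats.T @ np.linalg.solve(G, feats @ self.R)
        # Correct the weights by the residual on the new batch only.
        self.W = self.W + self.R @ feats.T @ (Y - feats @ self.W)

    def predict(self, feats):
        return np.argmax(feats @ self.W, axis=1)
```

In this sketch, after any number of phases the weights equal the ridge solution computed on all data pooled together, even though earlier samples are never stored. That property is what makes an exemplar-free update possible.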
Findings
Outperforms baseline methods in experiments
Ensures data privacy and efficient memory usage
Suitable for online training scenarios
Abstract
As deepfake speech becomes common and hard to detect, it is vital to trace its source. Recent work on audio deepfake source tracing (ST) aims to find the origins of synthetic or manipulated speech. However, ST models must adapt to learn new deepfake attacks while retaining knowledge of the previous ones. A major challenge is catastrophic forgetting, where models lose the ability to recognize previously learned attacks. Some continual learning methods help with deepfake detection, but multi-class tasks such as ST introduce additional challenges as the number of classes grows. To address this, we propose an analytic class incremental learning method called AnaST. When new attacks appear, the feature extractor remains fixed, and the classifier is updated with a closed-form analytical solution in one epoch. This approach ensures data privacy, optimizes memory usage, and is suitable for…
Taxonomy
Topics: Music and Audio Processing · Digital Media Forensic Detection · Speech Recognition and Synthesis
