Reject Threshold Adaptation for Open-Set Model Attribution of Deepfake Audio
Xinrui Yan, Jiangyan Yi, Jianhua Tao, Yujie Chen, Hao Gu, Guanjun Li, Junzuo Zhou, Yong Ren, Tao Xu

TL;DR
This paper introduces ReTA, a novel framework that adaptively determines rejection thresholds for open-set deepfake audio attribution, addressing overconfidence and distribution shift issues in existing methods.
Contribution
ReTA employs reconstruction error distribution modeling and Gaussian probability estimation to adaptively set reject thresholds, enhancing open-set deepfake audio attribution.
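To make the idea of a data-driven reject threshold concrete, here is a minimal sketch. It is not the authors' implementation: the summary does not say how ReTA derives the threshold from the fitted distributions, so the sketch assumes one plausible rule. It fits a Gaussian to reconstruction errors from matched (correct-label) pairs and another to errors from non-matched pairs, then places the threshold where the two fitted densities are equal. All function names are hypothetical.

```python
import numpy as np


def fit_gaussian(errors):
    """Return (mean, std) of a 1-D sample of reconstruction errors."""
    errors = np.asarray(errors, dtype=float)
    return errors.mean(), errors.std(ddof=1)


def gaussian_logpdf(x, mu, sigma):
    """Log-density of N(mu, sigma^2), dropping the shared constant term."""
    return -0.5 * ((x - mu) / sigma) ** 2 - np.log(sigma)


def adaptive_threshold(match_errors, nonmatch_errors, grid_size=10_000):
    """Assumed rule: threshold = point between the two means where the
    fitted matched and non-matched Gaussians are equally likely."""
    mu_m, sd_m = fit_gaussian(match_errors)
    mu_n, sd_n = fit_gaussian(nonmatch_errors)
    grid = np.linspace(mu_m, mu_n, grid_size)
    gap = np.abs(gaussian_logpdf(grid, mu_m, sd_m)
                 - gaussian_logpdf(grid, mu_n, sd_n))
    return float(grid[np.argmin(gap)])


def is_unknown(error, threshold):
    """Reject a sample as an unknown source if its error exceeds the threshold."""
    return error > threshold


# Toy errors: matched pairs reconstruct well, non-matched pairs do not.
threshold = adaptive_threshold([0.9, 1.0, 1.1], [2.9, 3.0, 3.1])
print(threshold, is_unknown(2.7, threshold), is_unknown(1.2, threshold))
```

Because the threshold is recomputed from whatever error distributions are observed, it moves with the data instead of being hand-tuned, which is the property the contribution statement emphasizes.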
Findings
ReTA improves detection accuracy of unknown deepfake audio sources.
Adaptive thresholds outperform fixed thresholds in open-set scenarios.
Experimental results validate the effectiveness of the proposed method.
Abstract
Open environment oriented open set model attribution of deepfake audio is an emerging research topic, aiming to identify the generation models of deepfake audio. Most previous work requires manually setting a rejection threshold for unknown classes to compare with predicted probabilities. However, models often overfit training instances and generate overly confident predictions. Moreover, thresholds that effectively distinguish unknown categories in the current dataset may not be suitable for identifying known and unknown categories in another data distribution. To address the issues, we propose a novel framework for open set model attribution of deepfake audio with rejection threshold adaptation (ReTA). Specifically, the reconstruction error learning module trains by combining the representation of system fingerprints with labels corresponding to either the target class or a randomly…
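For contrast, the manual-threshold baseline that the abstract criticizes can be sketched as follows. This is an illustrative assumption about how such baselines typically work, not code from the paper: the classifier's maximum softmax probability is compared with a hand-picked constant, and the `0.9` default and the `UNKNOWN` sentinel are hypothetical.

```python
import numpy as np

UNKNOWN = -1  # hypothetical label for rejected (unknown-source) samples


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def predict_with_fixed_threshold(logits, threshold=0.9):
    """Assign the arg-max class, or UNKNOWN if the top probability is low."""
    probs = softmax(np.asarray(logits, dtype=float))
    labels = probs.argmax(axis=-1)
    labels[probs.max(axis=-1) < threshold] = UNKNOWN
    return labels


# A confident prediction is accepted; a flat one is rejected.
print(predict_with_fixed_threshold([[10.0, 0.0, 0.0], [1.0, 1.0, 1.1]]))
```

An overconfident model can emit large logits even for audio from an unseen generator, so such samples pass the fixed check; and a constant tuned on one dataset need not transfer to another data distribution.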
Taxonomy
Topics: Speech and Audio Processing · Image and Signal Denoising Methods · Music and Audio Processing
Methods: Sparse Evolutionary Training
