Reject Threshold Adaptation for Open-Set Model Attribution of Deepfake Audio

Xinrui Yan; Jiangyan Yi; Jianhua Tao; Yujie Chen; Hao Gu; Guanjun Li; Junzuo Zhou; Yong Ren; Tao Xu

arXiv:2412.01425·cs.SD·December 3, 2024

Open Access

TL;DR

This paper introduces ReTA, a novel framework that adaptively determines rejection thresholds for open-set deepfake audio attribution, addressing overconfidence and distribution shift issues in existing methods.

Contribution

ReTA employs reconstruction error distribution modeling and Gaussian probability estimation to adaptively set reject thresholds, enhancing open-set deepfake audio attribution.
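As a rough illustration of the idea summarized above, the sketch below fits one Gaussian to reconstruction errors from correctly conditioned samples and another to errors from wrongly conditioned samples, then places the reject threshold where the two fitted densities cross. This is an assumption-laden toy, not the authors' implementation: the function names (`fit_gaussian`, `adaptive_threshold`, `is_unknown`), the density-crossing criterion, and the synthetic error values are all hypothetical, and the paper's actual threshold rule may differ.

```python
import numpy as np


def fit_gaussian(errors):
    """Estimate mean and standard deviation of a 1-D array of errors."""
    errors = np.asarray(errors, dtype=float)
    # Small epsilon guards against a zero standard deviation.
    return float(errors.mean()), float(errors.std() + 1e-12)


def gaussian_pdf(x, mu, sigma):
    """Density of a normal distribution N(mu, sigma^2) evaluated at x."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))


def adaptive_threshold(match_errors, nonmatch_errors, grid_size=1000):
    """Hypothetical rule: pick the point between the two means where the
    fitted match and non-match densities are closest to each other."""
    mu_m, sd_m = fit_gaussian(match_errors)
    mu_n, sd_n = fit_gaussian(nonmatch_errors)
    grid = np.linspace(mu_m, mu_n, grid_size)
    gap = np.abs(gaussian_pdf(grid, mu_m, sd_m) - gaussian_pdf(grid, mu_n, sd_n))
    return float(grid[np.argmin(gap)])


def is_unknown(error, threshold):
    """Reject a sample as 'unknown' when its reconstruction error exceeds the threshold."""
    return bool(error > threshold)


# Synthetic stand-in data (assumed, not from the paper): low errors for
# matching conditions, higher errors for non-matching conditions.
rng = np.random.default_rng(0)
match_errors = rng.normal(loc=0.10, scale=0.02, size=500)
nonmatch_errors = rng.normal(loc=0.50, scale=0.05, size=500)

threshold = adaptive_threshold(match_errors, nonmatch_errors)
print(f"adaptive threshold: {threshold:.3f}")
print("error 0.60 rejected:", is_unknown(0.60, threshold))
print("error 0.10 rejected:", is_unknown(0.10, threshold))
```

Because the threshold is derived from the fitted error distributions rather than fixed by hand, it would shift automatically if the same procedure were rerun on data with a different error profile, which is the behavior an adaptive threshold is meant to provide.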

Findings

01. ReTA improves detection accuracy of unknown deepfake audio sources.

02. Adaptive thresholds outperform fixed thresholds in open-set scenarios.

03. Experimental results validate the effectiveness of the proposed method.

Abstract

Open-set model attribution of deepfake audio in open environments is an emerging research topic that aims to identify the generation models behind deepfake audio. Most previous work requires manually setting a rejection threshold for unknown classes, against which predicted probabilities are compared. However, models often overfit training instances and produce overconfident predictions. Moreover, a threshold that effectively separates unknown categories in the current dataset may not be suitable for distinguishing known from unknown categories under another data distribution. To address these issues, we propose a novel framework for open-set model attribution of deepfake audio with rejection threshold adaptation (ReTA). Specifically, the reconstruction error learning module is trained by combining the representation of system fingerprints with labels corresponding to either the target class or a randomly…

Taxonomy

Topics: Speech and Audio Processing · Image and Signal Denoising Methods · Music and Audio Processing

Methods: Sparse Evolutionary Training