An Unsupervised Domain Adaptation Method for Locating Manipulated Region in partially fake Audio
Siding Zeng, Jiangyan Yi, Jianhua Tao, Yujie Chen, Shan Liang, Yong Ren, Xiaohui Zhang

TL;DR
This paper introduces an unsupervised domain adaptation method called SDE for locating manipulated regions in partially fake audio, significantly improving detection performance across different datasets by selecting informative samples and generating labels.
Contribution
The paper proposes a novel unsupervised domain adaptation approach using diverse experts and entropy-based sample selection for improved fake audio detection across domains.
Findings
Achieved an F1 score of 43.84% using 10% of the target-domain samples.
A 77.2% relative improvement over the second-best method.
Demonstrated effective cross-domain localization of manipulated regions in partially fake audio.
Abstract
When the task of locating manipulated regions in partially fake audio (PFA) involves cross-domain datasets, the performance of deep learning models drops significantly due to the shift between the source and target domains. To address this issue, existing approaches often employ data augmentation before training. However, they overlook characteristics of the target domain that are absent from the source domain. Inspired by the mixture-of-experts model, we propose an unsupervised method named Samples mining with Diversity and Entropy (SDE). Our method first learns from a collection of diverse experts that achieve strong performance from different perspectives in the source domain but are ambiguous on target samples. We leverage these diverse experts to select the most informative samples by calculating their entropy. Furthermore, we introduce a label generation method tailored for these…
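The entropy-based selection step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract is truncated and does not state which entropy is used, so the sketch assumes the Shannon entropy of the averaged expert prediction over two classes (real, fake). The function names `mean_prediction_entropy` and `select_informative` are hypothetical, and the 10% selection fraction is borrowed from the Findings above.

```python
import math


def mean_prediction_entropy(expert_probs):
    """Entropy of the averaged expert prediction for each sample.

    expert_probs[e][s] is expert e's class distribution for sample s.
    Assumption: averaging the experts and taking Shannon entropy is one
    plausible reading of "calculating their entropy"; the source does
    not confirm the exact formula.
    """
    n_experts = len(expert_probs)
    n_samples = len(expert_probs[0])
    entropies = []
    for s in range(n_samples):
        n_classes = len(expert_probs[0][s])
        mean = [
            sum(expert_probs[e][s][c] for e in range(n_experts)) / n_experts
            for c in range(n_classes)
        ]
        # Skip zero probabilities, since 0 * log(0) is taken as 0.
        entropies.append(-sum(p * math.log(p) for p in mean if p > 0))
    return entropies


def select_informative(expert_probs, fraction=0.1):
    """Return indices of the highest-entropy samples, most uncertain first."""
    entropies = mean_prediction_entropy(expert_probs)
    k = max(1, round(fraction * len(entropies)))
    ranked = sorted(range(len(entropies)), key=lambda s: entropies[s], reverse=True)
    return ranked[:k]


# Toy target-domain batch: two experts agree on every sample except index 3.
expert_a = [[0.95, 0.05] for _ in range(10)]
expert_b = [[0.95, 0.05] for _ in range(10)]
expert_a[3] = [0.9, 0.1]
expert_b[3] = [0.1, 0.9]

selected = select_informative([expert_a, expert_b], fraction=0.1)
```

In this toy batch the experts disagree only on sample 3, so its averaged prediction is the flattest and it is the one sample selected. Disagreement among diverse experts is what makes a target sample "informative" under this reading.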
Taxonomy
Topics: Digital Media Forensic Detection · Music and Audio Processing · Speech and Audio Processing
