Generalizable Audio Deepfake Detection via Hierarchical Structure Learning and Feature Whitening in Poincaré sphere
Mingru Yang, Yanmei Gu, Qianhua He, Yanxiong Li, Peirong Zhang, Yongqiang Chen, Zhiming Wang, Huijia Zhu, Jian Liu, Weiqiang Wang

TL;DR
This paper introduces Poin-HierNet, a novel framework for audio deepfake detection that constructs hierarchical, domain-invariant representations in the Poincaré sphere to improve generalization across diverse attacks and domains.
Contribution
The paper proposes a new hierarchical learning framework in the Poincaré sphere, combining prototype learning, structure learning, and feature whitening for robust audio deepfake detection.
Findings
Outperforms state-of-the-art methods on multiple datasets
Achieves lower Equal Error Rate across diverse conditions
Demonstrates strong domain generalization capabilities
Abstract
Audio deepfake detection (ADD) faces critical generalization challenges due to diverse real-world spoofing attacks and domain variations. However, existing methods primarily rely on Euclidean distances and thus fail to adequately capture the intrinsic hierarchical structures associated with attack categories and domain factors. To address these issues, we design a novel framework, Poin-HierNet, to construct domain-invariant hierarchical representations in the Poincaré sphere. Poin-HierNet includes three key components: 1) Poincaré Prototype Learning (PPL), which uses several data prototypes to align sample features and capture multilevel hierarchies beyond human labels; 2) Hierarchical Structure Learning (HSL), which leverages top prototypes to establish a tree-like hierarchical structure from data prototypes; and 3) Poincaré Feature Whitening (PFW), which enhances domain invariance by applying feature…
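To illustrate why the abstract contrasts Euclidean distances with a hyperbolic embedding space, the following is a minimal sketch of the geodesic distance in the standard Poincaré ball model. It assumes unit curvature (-1) and that the paper's "Poincaré sphere" corresponds to this model. The function names `poincare_distance` and `euclidean_distance` are hypothetical and not taken from the paper, and the sketch does not reproduce PPL, HSL, or PFW.

```python
import math


def euclidean_distance(u, v):
    """Ordinary straight-line distance, for comparison."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def poincare_distance(u, v):
    """Geodesic distance between two points strictly inside the unit ball.

    Assumption: standard Poincare ball model with curvature -1:
        d(u, v) = arcosh(1 + 2 * |u - v|^2 / ((1 - |u|^2) * (1 - |v|^2)))
    """
    sq_norm_u = sum(x * x for x in u)
    sq_norm_v = sum(x * x for x in v)
    if sq_norm_u >= 1.0 or sq_norm_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_u) * (1.0 - sq_norm_v)))


# Two pairs with (nearly) the same Euclidean gap of 0.1:
# one near the origin, one near the boundary of the ball.
near_origin = ((0.0, 0.0), (0.1, 0.0))
near_boundary = ((0.85, 0.0), (0.95, 0.0))

for name, (p, q) in (("near origin", near_origin), ("near boundary", near_boundary)):
    print(name, euclidean_distance(p, q), poincare_distance(p, q))
```

In this model, distances grow rapidly toward the boundary, so the pair near the boundary is much farther apart hyperbolically than the pair near the origin despite the equal Euclidean gap. This property is what makes hyperbolic spaces a common choice for embedding tree-like structures, with coarse nodes near the origin and fine-grained ones near the boundary.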
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Digital Media Forensic Detection · Adversarial Robustness in Machine Learning
