Phase-Aware Spoof Speech Detection Based on Res2Net with Phase Network
Juntae Kim, Sung Min Ban

TL;DR
This paper introduces a phase-aware spoof speech detection system using Res2Net with a dedicated phase network, improving detection accuracy by effectively integrating magnitude and phase features for diverse spoofing attacks.
Contribution
The paper proposes a novel phase network that reduces the difference in randomness between magnitude and phase features, enabling their effective feature-level fusion in spoof speech detection.
Findings
The system significantly improves spoof detection performance, especially for attacks where phase information is crucial.
Effective handling of feature randomness enhances generalization to unknown spoofing attacks.
Robustness is demonstrated for both known and unknown kinds of spoofing attacks.
Abstract
Spoof speech detection (SSD) is an essential countermeasure for automatic speaker verification systems. Although SSD with magnitude features in the frequency domain has shown promising results, phase information can also be important for capturing the artefacts of certain types of spoofing attacks. Thus, both magnitude and phase features must be considered to ensure generalization to diverse types of spoofing attacks. In this paper, we investigate why the feature-level fusion used in previous works fails. Through entropy analysis, we found that the difference in randomness between magnitude and phase features is large, which can disrupt feature-level fusion via a backend neural network; thus, we propose a phase network to reduce that difference. Our SSD system, a Res2Net equipped with the phase network, achieved significant performance improvement,…
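The entropy comparison mentioned in the abstract can be illustrated with a small sketch. This is not the authors' procedure: the histogram-based Shannon entropy estimator, the bin count, the log-magnitude scaling, and the helper names `dft`, `histogram_entropy`, and `magnitude_phase_entropy` are all assumptions made for illustration. It only shows one plausible way to measure how "random" the magnitude and phase of a single frame look.

```python
import cmath
import math
import random


def dft(frame):
    """Naive O(N^2) DFT of a real frame; returns the non-negative-frequency bins."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        for k in range(n // 2 + 1)
    ]


def histogram_entropy(values, lo, hi, bins=16):
    """Shannon entropy in bits of a fixed-range histogram (assumed estimator)."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        # Clamp so that v == hi falls into the last bin.
        idx = max(0, min(int((v - lo) / width), bins - 1))
        counts[idx] += 1
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts if c)


def magnitude_phase_entropy(frame, bins=16):
    """Return (entropy of log-magnitude, entropy of phase) for one frame."""
    spectrum = dft(frame)
    log_mag = [math.log(abs(c) + 1e-8) for c in spectrum]
    phase = [cmath.phase(c) for c in spectrum]  # values in [-pi, pi]
    h_mag = histogram_entropy(log_mag, min(log_mag), max(log_mag) + 1e-12, bins)
    h_phase = histogram_entropy(phase, -math.pi, math.pi, bins)
    return h_mag, h_phase


# Toy input: a noisy sinusoid standing in for one speech frame (hypothetical data).
random.seed(0)
frame = [math.sin(2 * math.pi * 8 * t / 128) + 0.1 * random.gauss(0, 1)
         for t in range(128)]
h_mag, h_phase = magnitude_phase_entropy(frame)
print(f"log-magnitude entropy: {h_mag:.2f} bits, phase entropy: {h_phase:.2f} bits")
```

Both values are bounded by log2(bins), so they can be compared directly. How large the gap is on real speech, and how the paper's phase network narrows it, depends on details not given in this summary.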
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Geophysical Methods and Applications
Methods: Residual Connection · Average Pooling · Non Maximum Suppression · Kaiming Initialization · Res2Net Block · Global Average Pooling · Convolution · Batch Normalization · 1x1 Convolution
