Two-Path GMM-ResNet and GMM-SENet for ASV Spoofing Detection
Zhenchun Lei, Hui Yan, Changhong Liu, Minglei Ma, Yingen Yang

TL;DR
This paper introduces two neural network models, GMM-ResNet and GMM-SENet, that take Gaussian probability features from two GMMs (one trained on genuine speech, one on spoofed speech) as input for spoofing detection in automatic speaker verification, significantly outperforming the traditional GMM classifier.
Contribution
The paper proposes two two-path models that consider both the score distribution over GMM components and the relationship between adjacent frames, improving spoofing detection accuracy over the 2-class GMM baseline.
Findings
Significant reduction in min-tDCF and EER on ASVspoof 2019 datasets.
GMM-ResNet and GMM-SENet outperform traditional GMM classifiers.
Score fusion achieves second-best results in evaluations.
Abstract
The automatic speaker verification system is sometimes vulnerable to various spoofing attacks. The 2-class Gaussian Mixture Model classifier for genuine and spoofed speech is usually used as the baseline for spoofing detection. However, the GMM classifier does not separately consider the scores of feature frames on each Gaussian component. In addition, the GMM accumulates the scores on all frames independently, and does not consider their correlations. We propose the two-path GMM-ResNet and GMM-SENet models for spoofing detection, whose input is the Gaussian probability features based on two GMMs trained on genuine and spoofed speech respectively. The models consider not only the score distribution on GMM components, but also the relationship between adjacent frames. A two-step training scheme is applied to improve the system robustness. Experiments on the ASVspoof 2019 show that the…
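The contrast drawn in the abstract, between a baseline that collapses every frame to a single score and a model that keeps the per-component scores, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the diagonal-covariance GMMs, the random parameters, the function names (`component_log_probs`, `gmm_baseline_score`, `two_path_features`), and the use of unnormalized weighted log-densities as "Gaussian probability features" are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np


def component_log_probs(frames, weights, means, variances):
    """Weighted log-density of each frame under each GMM component.

    Assumes diagonal covariances (an illustrative choice).
    frames: (T, D); weights: (K,); means, variances: (K, D). Returns (T, K).
    """
    diff = frames[:, None, :] - means[None, :, :]                       # (T, K, D)
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)   # (K,)
    mahal = -0.5 * np.sum(diff ** 2 / variances[None, :, :], axis=2)    # (T, K)
    return np.log(weights)[None, :] + log_norm[None, :] + mahal


def logsumexp(a, axis):
    """Numerically stable log(sum(exp(a))) along one axis."""
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))


def gmm_baseline_score(frames, genuine, spoof):
    """2-class GMM baseline: average per-frame log-likelihood ratio.

    Each frame is scored independently and the component axis is summed out,
    which is the information loss the abstract points to.
    """
    ll_genuine = logsumexp(component_log_probs(frames, *genuine), axis=1)  # (T,)
    ll_spoof = logsumexp(component_log_probs(frames, *spoof), axis=1)      # (T,)
    return float(np.mean(ll_genuine - ll_spoof))


def two_path_features(frames, genuine, spoof):
    """Keep the full (T, K) score maps, one per GMM, as inputs to two paths."""
    return (component_log_probs(frames, *genuine),
            component_log_probs(frames, *spoof))


# Hypothetical toy setup: 50 frames, 4-dim features, 3 components per GMM.
rng = np.random.default_rng(0)
T, D, K = 50, 4, 3


def random_gmm():
    w = rng.random(K)
    return w / w.sum(), rng.normal(size=(K, D)), rng.random((K, D)) + 0.5


genuine_gmm, spoof_gmm = random_gmm(), random_gmm()
frames = rng.normal(size=(T, D))

feat_genuine, feat_spoof = two_path_features(frames, genuine_gmm, spoof_gmm)
print(feat_genuine.shape, feat_spoof.shape)  # one (T, K) map per path
print(gmm_baseline_score(frames, genuine_gmm, spoof_gmm))
```

In this sketch the baseline returns one scalar per utterance, while the two `(T, K)` maps preserve both the component axis and the frame order, so a downstream network such as a ResNet or SENet could look at patterns across adjacent frames.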
