Adversarial Speaker Distillation for Countermeasure Model on Automatic Speaker Verification

Yen-Lun Liao; Xuanjun Chen; Chung-Che Wang; Jyh-Shing Roger Jang

arXiv:2203.17031·cs.SD·February 17, 2025·1 cites

Open Access

TL;DR

This paper introduces an adversarial speaker distillation technique to create compact, secure countermeasure models for automatic speaker verification, achieving high performance with significantly reduced model size.

Contribution

The paper proposes adversarial speaker distillation, a method that combines knowledge distillation with GE2E pre-training and adversarial fine-tuning to produce resource-efficient CM models.

Findings

01

Achieved 0.2695 min t-DCF and 3.54% EER in the evaluation phase of the ASVspoof 2021 Logical Access task.

02

The model uses only 22.5% of the parameters of the original ResNetSE.

03

Significantly reduces computational requirements while maintaining performance.
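For readers unfamiliar with the EER metric quoted in the findings, the following is a minimal sketch of how an equal error rate can be estimated from countermeasure scores. It assumes higher scores mean "more likely bona fide"; the function name `equal_error_rate` and the toy scores are illustrative and do not come from the paper or from the official ASVspoof evaluation tooling.

```python
def equal_error_rate(bona_fide_scores, spoof_scores):
    """Estimate the EER by scanning every observed score as a threshold.

    Assumption: a higher score means the trial looks more bona fide.
    """
    thresholds = sorted(set(bona_fide_scores) | set(spoof_scores))
    best_gap = float("inf")
    eer = None
    for t in thresholds:
        # False rejection: bona fide trials scored below the threshold.
        frr = sum(s < t for s in bona_fide_scores) / len(bona_fide_scores)
        # False acceptance: spoofed trials scored at or above the threshold.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the operating point where the two error rates are closest.
            best_gap = gap
            eer = (far + frr) / 2
    return eer


# Toy scores, purely hypothetical.
bona_fide = [0.9, 0.8, 0.7, 0.3]
spoof = [0.1, 0.2, 0.4, 0.6]
print(equal_error_rate(bona_fide, spoof))
```

With these toy scores, one bona fide trial and one spoofed trial fall on the wrong side of the best threshold, so the estimate is 0.25. Real evaluations interpolate between thresholds over far larger score sets.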

Abstract

The countermeasure (CM) model protects Automatic Speaker Verification (ASV) systems from spoofing attacks and prevents the resulting leakage of personal information. For practicality and security reasons, the CM model is usually deployed on edge devices, which have more limited computing resources and storage space than cloud-based systems, and this constrains the model size. To better trade off CM model size against performance, we propose an adversarial speaker distillation method, an improved version of knowledge distillation combined with generalized end-to-end (GE2E) pre-training and adversarial fine-tuning. In the evaluation phase of the ASVspoof 2021 Logical Access task, our proposed adversarial speaker distillation ResNetSE (ASD-ResNetSE) model reaches 0.2695 min t-DCF and 3.54% EER. ASD-ResNetSE only used 22.5% of…
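The abstract builds on knowledge distillation, in which a small student model is trained to match a larger teacher. The sketch below shows the generic soft-target formulation of a distillation loss, not the authors' loss: the names `distillation_loss`, `temperature`, and `alpha` and their default values are assumptions for illustration, and the GE2E pre-training and adversarial fine-tuning stages are not represented.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Generic distillation loss (assumed form, not the paper's).

    Mixes cross-entropy on the true label with the KL divergence
    between temperature-softened teacher and student distributions.
    """
    # Hard-label term: cross-entropy of the student at temperature 1.
    hard = -math.log(softmax(student_logits)[label])

    # Soft-target term: KL(teacher || student) at the given temperature.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    soft = sum(t * math.log(t / s)
               for t, s in zip(p_teacher, p_student) if t > 0)

    # The T^2 factor keeps the soft term's gradient scale comparable.
    return alpha * hard + (1 - alpha) * temperature ** 2 * soft


# Hypothetical two-class logits (bona fide vs. spoof).
print(distillation_loss([0.5, -0.5], [2.0, -2.0], label=0))
```

When the student's logits equal the teacher's, the soft-target term vanishes and only the weighted cross-entropy remains, which is a quick sanity check for any implementation of this form.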

Taxonomy

Topics: Speech Recognition and Synthesis · Adversarial Robustness in Machine Learning

Methods: Knowledge Distillation