Anti-Spoofing Using Transfer Learning with Variational Information Bottleneck
Youngsik Eom, Yeonghyeon Lee, Ji Sub Um, Hoirin Kim

TL;DR
This paper introduces a transfer learning approach using wav2vec 2.0 with variational information bottleneck to enhance anti-spoofing in speaker verification, especially for unseen and low-resource scenarios.
Contribution
It presents a novel transfer learning scheme with VIB for speech anti-spoofing, improving generalization and robustness over existing methods.
Findings
Outperforms state-of-the-art anti-spoofing systems on ASVspoof 2019 LA database.
Enhances detection in low-resource and cross-dataset settings.
Demonstrates robustness to data size and distribution variations.
Abstract
Recent advances in sophisticated synthetic speech generated from text-to-speech (TTS) or voice conversion (VC) systems cause threats to the existing automatic speaker verification (ASV) systems. Since such synthetic speech is generated from diverse algorithms, generalization ability with using limited training data is indispensable for a robust anti-spoofing system. In this work, we propose a transfer learning scheme based on the wav2vec 2.0 pretrained model with variational information bottleneck (VIB) for speech anti-spoofing task. Evaluation on the ASVspoof 2019 logical access (LA) database shows that our method improves the performance of distinguishing unseen spoofed and genuine speech, outperforming current state-of-the-art anti-spoofing systems. Furthermore, we show that the proposed system improves performance in low-resource and cross-dataset settings of anti-spoofing task…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsSpeech Recognition and Synthesis · Speech and Audio Processing · Voice and Speech Disorders
