SAMO: Speaker Attractor Multi-Center One-Class Learning for Voice Anti-Spoofing

Siwen Ding; You Zhang; Zhiyao Duan

arXiv:2211.02718 · eess.AS · November 8, 2022

TL;DR

This paper introduces SAMO, a novel speaker attractor multi-center one-class learning approach that enhances voice anti-spoofing by better modeling speaker diversity and improving detection of unseen attacks.

Contribution

SAMO clusters bona fide speech around multiple speaker attractors and co-optimizes clustering with spoof detection, advancing anti-spoofing performance.

Findings

01. Outperforms state-of-the-art systems with a 38% relative reduction in equal error rate (EER).

02. Effectively handles speakers without enrollment.

03. Improves generalization to unseen speech synthesis attacks.
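
To make the EER figure above concrete, here is a minimal sketch of how an equal error rate and a relative reduction are typically computed. The score lists and the EER values below are hypothetical and are not results from the paper. The convention that a higher score means "more likely bona fide" is also an assumption.

```python
def equal_error_rate(bona_fide_scores, spoof_scores):
    """Approximate EER by sweeping thresholds over the observed scores.

    Returns the error rate at the threshold where the false-rejection rate
    (bona fide scored below the threshold) and the false-acceptance rate
    (spoof scored at or above it) are closest.
    """
    thresholds = sorted(set(bona_fide_scores) | set(spoof_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        frr = sum(s < t for s in bona_fide_scores) / len(bona_fide_scores)
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        gap = abs(frr - far)
        if gap < best_gap:
            best_gap, eer = gap, (frr + far) / 2
    return eer


def relative_reduction(baseline_eer, new_eer):
    """Fraction by which new_eer improves on baseline_eer."""
    return (baseline_eer - new_eer) / baseline_eer


# Hypothetical detector scores, for illustration only.
bona_fide = [0.9, 0.8, 0.7, 0.4]
spoof = [0.1, 0.2, 0.3, 0.6]
eer = equal_error_rate(bona_fide, spoof)

# Hypothetical EERs (in percent): going from 1.00 to 0.62 is a 38% relative reduction.
reduction = relative_reduction(1.00, 0.62)
```

A relative reduction compares two error rates as a ratio, so a 38% relative reduction does not mean the EER dropped by 38 percentage points.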

Abstract

Voice anti-spoofing systems are crucial auxiliaries for automatic speaker verification (ASV) systems. A major challenge is caused by unseen attacks empowered by advanced speech synthesis technologies. Our previous research on one-class learning has improved the generalization ability to unseen attacks by compacting the bona fide speech in the embedding space. However, such compactness lacks consideration of the diversity of speakers. In this work, we propose speaker attractor multi-center one-class learning (SAMO), which clusters bona fide speech around a number of speaker attractors and pushes away spoofing attacks from all the attractors in a high-dimensional embedding space. For training, we propose an algorithm for the co-optimization of bona fide speech clustering and bona fide/spoof classification. For inference, we propose strategies to enable anti-spoofing for speakers without…
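
The clustering-and-repulsion idea in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the loss formula, so the cosine similarity, the softplus margin penalty, the per-speaker mean used as an attractor, and the names `samo_style_loss`, `update_attractors`, `score`, `m_real`, `m_spoof`, and `alpha` are all hypothetical, and the default values are arbitrary.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def softplus(x):
    """Smooth version of max(0, x)."""
    return math.log1p(math.exp(x))


def update_attractors(bona_fide_by_speaker):
    """Assumed attractor rule: the mean bona fide embedding of each speaker."""
    attractors = {}
    for speaker, embeddings in bona_fide_by_speaker.items():
        dim = len(embeddings[0])
        attractors[speaker] = [
            sum(e[i] for e in embeddings) / len(embeddings) for i in range(dim)
        ]
    return attractors


def samo_style_loss(embedding, is_bona_fide, speaker, attractors,
                    m_real=0.7, m_spoof=0.0, alpha=20.0):
    """Per-utterance loss in the spirit of multi-center one-class learning.

    Bona fide speech is pulled toward its own speaker's attractor; spoofed
    speech is pushed away from the closest attractor, hence from all of them.
    """
    if is_bona_fide:
        sim = cosine(embedding, attractors[speaker])
        return softplus(alpha * (m_real - sim))
    sim = max(cosine(embedding, a) for a in attractors.values())
    return softplus(alpha * (sim - m_spoof))


def score(embedding, attractors, speaker=None):
    """Higher means more likely bona fide.

    Falling back to the nearest known attractor when the speaker has no
    attractor is one plausible strategy; the abstract is cut off before it
    describes the authors' own inference strategies.
    """
    if speaker in attractors:
        return cosine(embedding, attractors[speaker])
    return max(cosine(embedding, a) for a in attractors.values())


# Toy 2-D embeddings, for illustration only.
attractors = update_attractors({
    "spk_a": [[1.0, 0.0], [0.8, 0.2]],
    "spk_b": [[0.0, 1.0]],
})
loss_bona_fide_near = samo_style_loss([1.0, 0.1], True, "spk_a", attractors)
loss_spoof_far = samo_style_loss([-1.0, -1.0], False, None, attractors)
loss_spoof_near = samo_style_loss([0.0, 1.0], False, None, attractors)
```

In a training loop, the co-optimization mentioned in the abstract would presumably alternate between gradient steps on a loss like this and refreshing the attractors from the current bona fide embeddings. The sketch leaves the schedule open because the source does not state it.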

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Voice and Speech Disorders