Spoofing-Aware Attention based ASV Back-end with Multiple Enrollment Utterances and a Sampling Strategy for the SASV Challenge 2022

Chang Zeng; Lin Zhang; Meng Liu; Junichi Yamagishi

arXiv:2209.00423·eess.AS·October 27, 2022·Interspeech

TL;DR

This paper introduces a spoofing-aware back-end for automatic speaker verification that combines speaker and spoofing scores with attention mechanisms, and proposes a new sampling strategy for spoofing scenarios.

Contribution

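It presents an attention-based back-end that fuses ASV speaker-similarity scores with countermeasure (CM) scores while modeling the relationships among multiple enrollment utterances, and a trials-sampling strategy for simulating spoofing scenarios.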

Findings

1. Enhanced fusion of speaker and spoof scores using attention mechanisms

2. Improved robustness in spoofing detection scenarios

3. Effective simulation of spoofing scenarios for the SASV challenge

Abstract

Current state-of-the-art automatic speaker verification (ASV) systems are vulnerable to presentation attacks, and several countermeasures (CMs), which distinguish bona fide trials from spoofing ones, have been explored to protect ASV. However, ASV systems and CMs are generally developed and optimized independently without considering their inter-relationship. In this paper, we propose a new spoofing-aware ASV back-end module that efficiently computes a combined ASV score based on speaker similarity and CM score. In addition to the learnable fusion function of the two scores, the proposed back-end module has two types of attention components, scaled-dot and feed-forward self-attention, so that intra-relationship information of multiple enrollment utterances can also be learned at the same time. Moreover, a new effective trials-sampling strategy is designed for simulating new…
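
The back-end described in the abstract can be illustrated with a rough sketch. The code below is not the authors' implementation: it assumes fixed-size speaker embeddings, uses plain scaled dot-product self-attention without learned projections to relate multiple enrollment utterances, omits the feed-forward self-attention component entirely, and stands in for the learnable fusion function with a logistic combination whose weights (`w_asv`, `w_cm`, `bias`) are hypothetical fixed values rather than trained parameters. It also assumes that a higher CM score means "more likely bona fide".

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cosine(u, v):
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention_pool(enroll):
    """Scaled dot-product self-attention over enrollment embeddings,
    followed by mean pooling into a single speaker representation.
    Assumption: queries, keys, and values are the raw embeddings."""
    d = len(enroll[0])
    attended = []
    for q in enroll:
        weights = softmax([dot(q, k) / math.sqrt(d) for k in enroll])
        attended.append(
            [sum(w * v[i] for w, v in zip(weights, enroll)) for i in range(d)]
        )
    n = len(attended)
    return [sum(row[i] for row in attended) / n for i in range(d)]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sasv_score(enroll, test, cm_score, w_asv=1.0, w_cm=1.0, bias=0.0):
    """Combine speaker similarity and a CM score into one score in (0, 1).
    The weights are hypothetical; in a trained system they would be learned."""
    speaker = self_attention_pool(enroll)
    asv_score = cosine(speaker, test)
    return sigmoid(w_asv * asv_score + w_cm * cm_score + bias)

# Toy usage: three enrollment embeddings and one test embedding.
enroll = [[1.0, 0.0, 0.2], [0.9, 0.1, 0.1], [1.0, 0.1, 0.0]]
test = [1.0, 0.05, 0.1]
bona_fide = sasv_score(enroll, test, cm_score=2.0)   # CM says bona fide
spoofed = sasv_score(enroll, test, cm_score=-2.0)    # CM says spoofed
```

With the same speaker similarity, the trial with the lower CM score receives the lower combined score, which is the behavior a spoofing-aware back-end is meant to provide.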

Taxonomy

Topics: Speech Recognition and Synthesis