SA-SASV: An End-to-End Spoof-Aggregated Spoofing-Aware Speaker Verification System

Zhongwei Teng; Quchen Fu; Jules White; Maria E. Powell; Douglas C. Schmidt

arXiv:2203.06517·cs.SD·March 28, 2022·1 cites

Open Access

TL;DR

This paper introduces SA-SASV, an end-to-end spoofing-aware speaker verification system that handles speaker verification and anti-spoofing in a single model, trained with multi-task learning on a dataset that contains few bonafide speakers.

Contribution

It presents an ensemble-free, end-to-end multi-task model for joint speaker verification and spoof detection that is optimized by multiple losses and places more flexible requirements on the training set.

Findings

01

Improved SASV-EER on the ASVSpoof 2019 LA dataset.

02

Training on combined datasets further enhances performance.

03

End-to-end multi-task approach outperforms separate systems.

Abstract

Research in the past several years has boosted the performance of automatic speaker verification systems and countermeasure systems, each of which now delivers low Equal Error Rates (EERs). However, research on joint optimization of both systems is still limited. The Spoofing-Aware Speaker Verification (SASV) 2022 challenge was proposed to encourage the development of integrated SASV systems, with new metrics to evaluate joint model performance. This paper proposes an ensemble-free, end-to-end solution, known as Spoof-Aggregated-SASV (SA-SASV), to build a SASV system with multi-task classifiers, which are optimized by multiple losses and have more flexible requirements for the training set. The proposed system is trained on the ASVSpoof 2019 LA dataset, a spoof verification dataset with a small number of bonafide speakers. Results of SASV-EER indicate that the model performance can be further…
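The abstract reports results as EER and SASV-EER without defining them. As a rough illustration only, the sketch below approximates an equal error rate by sweeping a threshold over the observed scores and picking the point where the false-accept and false-reject rates are closest. It assumes that SASV-EER counts both zero-effort impostor trials and spoofed trials as negatives, which follows the usual description of the SASV 2022 metric but is not stated in the text above. The function name and the toy scores are hypothetical, and this is not the authors' evaluation code.

```python
def equal_error_rate(positive_scores, negative_scores):
    """Approximate the EER from two lists of scores (higher = more likely accepted)."""
    # Candidate thresholds: every distinct score that was observed.
    thresholds = sorted(set(positive_scores) | set(negative_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # A trial is accepted when its score is at least the threshold.
        far = sum(s >= t for s in negative_scores) / len(negative_scores)
        frr = sum(s < t for s in positive_scores) / len(positive_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            # Report the midpoint of the two rates at the closest crossing.
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Hypothetical toy scores, invented for illustration only.
target = [0.9, 0.8, 0.7, 0.4]    # bonafide trials from the claimed speaker
nontarget = [0.3, 0.2, 0.6]      # bonafide trials from other speakers
spoof = [0.85, 0.1, 0.35]        # spoofed trials

# Speaker-verification EER: negatives are zero-effort impostors only.
sv_eer = equal_error_rate(target, nontarget)
# Assumed SASV-EER: negatives are impostors and spoofed trials together.
sasv_eer = equal_error_rate(target, nontarget + spoof)
print(f"SV-EER: {sv_eer:.3f}  SASV-EER: {sasv_eer:.3f}")
```

Evaluation toolkits typically interpolate between thresholds rather than taking the closest observed point, so values from this sketch can differ slightly from officially reported numbers.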

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing