Integrated Replay Spoofing-aware Text-independent Speaker Verification
Hye-jin Shim, Jee-weon Jung, Ju-ho Kim, Seung-bin Kim, Ha-Jin Yu

TL;DR
This paper explores methods to integrate speaker verification and presentation attack detection, proposing a back-end modular approach that achieves a 21.77% relative improvement in equal error rate (EER) on the ASVspoof 2017-v2 dataset.
Contribution
It introduces a back-end modular approach for integrating speaker verification and attack detection, addressing limitations of end-to-end multi-task learning.
Findings
The back-end modular approach outperforms the end-to-end monolithic approach.
It achieves a 21.77% relative EER improvement.
Results are validated on the ASVspoof 2017-v2 dataset.
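As a reading aid for the figures above, the following is a minimal sketch of how an EER and a relative EER improvement are typically computed. It is not the authors' evaluation code: the function names `compute_eer` and `relative_improvement`, the simple threshold sweep, and the toy scores are all assumptions made for illustration.

```python
import numpy as np


def compute_eer(target_scores, nontarget_scores):
    """Approximate the equal error rate with a sweep over observed scores.

    Hypothetical helper: a trial is accepted when its score >= threshold.
    """
    target_scores = np.asarray(target_scores, dtype=float)
    nontarget_scores = np.asarray(nontarget_scores, dtype=float)
    thresholds = np.sort(np.concatenate([target_scores, nontarget_scores]))

    # False rejection rate: fraction of target trials scored below the threshold.
    frr = np.array([(target_scores < t).mean() for t in thresholds])
    # False acceptance rate: fraction of non-target trials at or above the threshold.
    far = np.array([(nontarget_scores >= t).mean() for t in thresholds])

    # The EER is taken where the two error rates are closest to each other.
    idx = np.argmin(np.abs(far - frr))
    return (far[idx] + frr[idx]) / 2.0


def relative_improvement(baseline_eer, new_eer):
    """Relative EER reduction in percent, measured against the baseline."""
    return (baseline_eer - new_eer) / baseline_eer * 100.0


if __name__ == "__main__":
    # Toy scores, invented for illustration only (not from the paper).
    targets = [0.9, 0.8, 0.4]
    nontargets = [0.1, 0.2, 0.5]
    print("EER:", compute_eer(targets, nontargets))
    print("Relative improvement (%):", relative_improvement(10.0, 8.0))
```

Under this definition, a relative improvement is the EER reduction divided by the baseline EER, so it describes a proportional change rather than a drop in absolute percentage points.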
Abstract
A number of studies have successfully developed speaker verification or presentation attack detection systems. However, studies integrating the two tasks remain in the preliminary stages. In this paper, we propose two approaches for building an integrated system of speaker verification and presentation attack detection: an end-to-end monolithic approach and a back-end modular approach. The first approach simultaneously trains speaker identification, presentation attack detection, and the integrated system through multi-task learning on a common feature. However, through experiments, we hypothesize that the information required for performing speaker verification and presentation attack detection might differ because speaker verification systems try to remove device-specific information from speaker embeddings, while presentation attack detection systems exploit such information.…
