Optimizing a-DCF for Spoofing-Robust Speaker Verification
Oğuzhan Kurnaz, Jagabandhu Mishra, Tomi H. Kinnunen, and Cemal Hanilçi

TL;DR
This paper introduces a spoofing-robust speaker verification system trained directly on a-DCF. It combines an a-DCF loss with BCE and a novel threshold optimization technique, reporting a 13% relative improvement in minimum a-DCF over a BCE-only system with embedding fusion, and a 43% relative improvement with non-linear score fusion.
Contribution
It presents a new method that directly optimizes a-DCF for spoofing-robust speaker verification, integrating threshold optimization and fusion techniques for enhanced performance.
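One way to picture "directly optimizing a-DCF" together with BCE is to replace the hard accept/reject counts with a sigmoid so the cost becomes smooth in the scores and the threshold. The sketch below is only an illustration under stated assumptions: the names `soft_a_dcf`, `combined_loss`, `alpha`, and `weight`, the label coding (0 = target, 1 = non-target, 2 = spoof), and the default costs and priors are all hypothetical, and the summary does not say how the paper weights the two terms or optimizes the threshold.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-np.asarray(x, dtype=float)))

def soft_a_dcf(scores, labels, tau, alpha=10.0,
               c_miss=1.0, c_fa_non=10.0, c_fa_spf=10.0,
               pi_tar=0.9, pi_non=0.05, pi_spf=0.05):
    """Smooth surrogate of a-DCF at threshold tau.

    Costs and priors are illustrative placeholders, not the paper's values.
    Assumes every class (0 = target, 1 = non-target, 2 = spoof) is present.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    # Soft "accept" indicator: close to 1 when score is well above tau.
    accept = sigmoid(alpha * (scores - tau))
    p_miss = np.mean(1.0 - accept[labels == 0])   # targets softly rejected
    p_fa_non = np.mean(accept[labels == 1])       # non-targets softly accepted
    p_fa_spf = np.mean(accept[labels == 2])       # spoofs softly accepted
    return (c_miss * pi_tar * p_miss
            + c_fa_non * pi_non * p_fa_non
            + c_fa_spf * pi_spf * p_fa_spf)

def bce(scores, labels):
    """Binary cross-entropy with target (label 0) as the positive class."""
    y = (np.asarray(labels) == 0).astype(float)
    p = np.clip(sigmoid(scores), 1e-12, 1.0 - 1e-12)
    return -np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))

def combined_loss(scores, labels, tau, weight=1.0):
    """BCE plus a weighted soft a-DCF term (the weighting is an assumption)."""
    return bce(scores, labels) + weight * soft_a_dcf(scores, labels, tau)

labels = np.array([0, 0, 1, 2, 2])
good_scores = np.array([3.0, 2.0, -3.0, -2.0, -3.0])  # targets score high
bad_scores = -good_scores                             # everything inverted
loss_good = combined_loss(good_scores, labels, tau=0.0)
loss_bad = combined_loss(bad_scores, labels, tau=0.0)
```

In a real training setup both the network parameters and `tau` would receive gradients from such a loss through an autodiff framework; plain NumPy is used here only to keep the sketch self-contained.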
Findings
13% relative improvement over BCE-only system
43% relative improvement with non-linear score fusion
Significant reduction in minimum a-DCF scores
Abstract
Automatic speaker verification (ASV) systems are vulnerable to spoofing attacks. We propose a spoofing-robust ASV system optimized directly for the recently introduced architecture-agnostic detection cost function (a-DCF), which allows targeting a desired trade-off between the conflicting aims of user convenience and robustness to spoofing. We combine a-DCF and binary cross-entropy (BCE) with a novel, straightforward threshold optimization technique. Our results with an embedding fusion system on ASVspoof2019 data demonstrate a relative improvement of 13% over a system trained using BCE only (from a minimum a-DCF of […] to […]). Using an alternative non-linear score fusion approach provides a relative improvement of 43% (from a minimum a-DCF of […] to […]).
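For readers unfamiliar with the metric, a-DCF at a threshold is a weighted sum of three error rates: the target miss rate, the non-target false-acceptance rate, and the spoof false-acceptance rate, each scaled by a cost and a class prior. The minimum a-DCF quoted above is the lowest value over all thresholds. The following is a minimal sketch of the unnormalized form; the function names, the label coding (0 = target, 1 = non-target, 2 = spoof), and the default costs and priors are illustrative assumptions, not the configuration used in the paper.

```python
import numpy as np

def a_dcf(scores, labels, threshold,
          c_miss=1.0, c_fa_non=10.0, c_fa_spf=10.0,
          pi_tar=0.9, pi_non=0.05, pi_spf=0.05):
    """Unnormalized a-DCF at one threshold (accept when score >= threshold).

    Costs and priors are placeholder values chosen for illustration.
    Assumes every class (0 = target, 1 = non-target, 2 = spoof) is present.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    accept = scores >= threshold
    p_miss = np.mean(~accept[labels == 0])    # targets rejected
    p_fa_non = np.mean(accept[labels == 1])   # non-targets accepted
    p_fa_spf = np.mean(accept[labels == 2])   # spoofs accepted
    return (c_miss * pi_tar * p_miss
            + c_fa_non * pi_non * p_fa_non
            + c_fa_spf * pi_spf * p_fa_spf)

def min_a_dcf(scores, labels, **kwargs):
    """Sweep every observed score (plus +inf, i.e. reject all) as a threshold."""
    candidates = np.concatenate([np.unique(scores), [np.inf]])
    costs = np.array([a_dcf(scores, labels, t, **kwargs) for t in candidates])
    best = int(np.argmin(costs))
    return float(costs[best]), float(candidates[best])

# Toy scores: three target trials, one non-target, two spoofed trials.
scores = np.array([2.0, 1.5, 0.2, -1.0, -0.5, -2.0])
labels = np.array([0, 0, 0, 1, 2, 2])
best_cost, best_threshold = min_a_dcf(scores, labels)
```

Because the three classes in the toy data are perfectly separable, the sweep finds a threshold with zero cost; on real data the minimum reflects the trade-off set by the chosen costs and priors.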
Taxonomy
Topics: Speech and Audio Processing · Advanced Data Compression Techniques · Speech Recognition and Synthesis
