Revisiting and Improving Scoring Fusion for Spoofing-aware Speaker Verification Using Compositional Data Analysis
Xin Wang, Tomi Kinnunen, Kong Aik Lee, Paul-Gauthier Noé, Junichi Yamagishi

TL;DR
This paper revisits score-level fusion in spoofing-aware speaker verification, emphasizing the importance of score calibration and proposing improved linear and non-linear fusion methods based on decision theory and compositional data analysis.
Contribution
The paper gives a decision-theoretic interpretation of score fusion, shows why score calibration matters before fusion, and proposes linear and non-linear fusion methods that outperform existing heuristics.
Findings
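Calibrating the ASV and CM scores before fusion is essential and improves robustness.
Linearly combining the ASV and CM log-likelihood ratios improves on fusion by simple score summation.
Non-linear fusion is better than the linear combination at making optimal decisions.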
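The calibration step in the first finding can be illustrated with a minimal sketch. It assumes affine calibration fitted by logistic regression, which is one common choice; this summary does not say which calibration method the authors use. The function names and the plain gradient-descent settings are hypothetical and chosen only for illustration.

```python
import numpy as np

def fit_affine_calibration(scores, labels, n_iter=5000, lr=0.1):
    """Fit llr = a * score + b by logistic regression (hypothetical helper).

    labels: 1 for the positive class (e.g. target or bona fide), 0 otherwise.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=float)
    # Log prior odds of the training set. Adding it inside the model means
    # a * score + b is fitted as a log-likelihood ratio, not as log posterior odds.
    pos_rate = labels.mean()
    prior_logit = np.log(pos_rate / (1.0 - pos_rate))
    a, b = 1.0, 0.0
    for _ in range(n_iter):
        logit = a * scores + b + prior_logit
        p = 1.0 / (1.0 + np.exp(-logit))
        err = p - labels                      # gradient of the cross-entropy w.r.t. logit
        a -= lr * np.mean(err * scores)
        b -= lr * np.mean(err)
    return a, b

def apply_calibration(scores, a, b):
    """Map raw scores to approximate log-likelihood ratios."""
    return a * np.asarray(scores, dtype=float) + b
```

In this sketch the ASV and CM scores would each be calibrated separately on held-out data, so that both are on a comparable log-likelihood-ratio scale before any fusion rule is applied.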
Abstract
Fusing outputs from automatic speaker verification (ASV) and spoofing countermeasure (CM) is expected to make an integrated system robust to zero-effort imposters and synthesized spoofing attacks. Many score-level fusion methods have been proposed, but many remain heuristic. This paper revisits score-level fusion using tools from decision theory and presents three main findings. First, fusion by summing the ASV and CM scores can be interpreted on the basis of compositional data analysis, and score calibration before fusion is essential. Second, the interpretation leads to an improved fusion method that linearly combines the log-likelihood ratios of ASV and CM. However, as the third finding reveals, this linear combination is inferior to a non-linear one in making optimal decisions. The outcomes of these findings, namely, the score calibration before fusion, improved linear fusion, and…
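The contrast between linear and non-linear fusion can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: both inputs are taken to be calibrated log-likelihood ratios, the ASV and CM scores are treated as conditionally independent, spoofed trials are assumed to look like target trials to the ASV system, and rho is a hypothetical parameter for the prior share of spoofing attacks among all negative trials. The function names and the weights w_asv and w_cm are also hypothetical.

```python
import numpy as np

def linear_fusion(llr_asv, llr_cm, w_asv=1.0, w_cm=1.0):
    """Weighted sum of the ASV and CM log-likelihood ratios."""
    llr_asv = np.asarray(llr_asv, dtype=float)
    llr_cm = np.asarray(llr_cm, dtype=float)
    return w_asv * llr_asv + w_cm * llr_cm

def nonlinear_fusion(llr_asv, llr_cm, rho=0.5):
    """LLR of 'target' against a mixture of 'nontarget' and 'spoof'.

    Under the assumptions stated above, the fused score is
        -log((1 - rho) * exp(-llr_asv) + rho * exp(-llr_cm)),
    computed here with logaddexp for numerical stability.
    """
    llr_asv = np.asarray(llr_asv, dtype=float)
    llr_cm = np.asarray(llr_cm, dtype=float)
    return -np.logaddexp(np.log1p(-rho) - llr_asv, np.log(rho) - llr_cm)

# Example trial: strong speaker match, but the CM indicates a likely spoof.
fused_linear = linear_fusion(6.0, -4.0)
fused_nonlinear = nonlinear_fusion(6.0, -4.0, rho=0.5)
```

The non-linear rule behaves like a soft minimum: the fused score is dominated by whichever subsystem gives the weaker evidence, so a confident ASV match cannot cancel out a CM score that indicates a spoof, as it can in a plain weighted sum.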
Taxonomy
Topics: Speech Recognition and Synthesis · Advanced Chemical Sensor Technologies
