Revisiting and Improving Scoring Fusion for Spoofing-aware Speaker Verification Using Compositional Data Analysis

Xin Wang; Tomi Kinnunen; Kong Aik Lee; Paul-Gauthier Noé; Junichi Yamagishi

arXiv:2406.10836·eess.AS·September 25, 2024

TL;DR

This paper revisits score-level fusion in spoofing-aware speaker verification, emphasizing the importance of score calibration and proposing improved linear and non-linear fusion methods based on decision theory and compositional data analysis.

Contribution

It introduces a decision-theoretic interpretation of score fusion, highlighting the significance of score calibration and proposing novel fusion methods that outperform existing heuristics.

Findings

01. Score calibration before fusion improves robustness.

02. Linear fusion of log-likelihood ratios enhances performance.

03. Non-linear fusion yields the best decision accuracy.

Abstract

Fusing outputs from automatic speaker verification (ASV) and spoofing countermeasure (CM) is expected to make an integrated system robust to zero-effort imposters and synthesized spoofing attacks. Many score-level fusion methods have been proposed, but many remain heuristic. This paper revisits score-level fusion using tools from decision theory and presents three main findings. First, fusion by summing the ASV and CM scores can be interpreted on the basis of compositional data analysis, and score calibration before fusion is essential. Second, the interpretation leads to an improved fusion method that linearly combines the log-likelihood ratios of ASV and CM. However, as the third finding reveals, this linear combination is inferior to a non-linear one in making optimal decisions. The outcomes of these findings, namely, the score calibration before fusion, improved linear fusion, and…
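
To make the contrast between linear and non-linear fusion concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the function names, the affine calibration parameters (`scale`, `offset`), the fusion weights, and the spoofing prior `spoof_prior` are all hypothetical. The non-linear rule assumes the negative class is a mixture of zero-effort impostors and spoofing attacks, and that the calibrated ASV and CM scores can be read as log-likelihood ratios (LLRs) of the target bona fide class against each of those two components.

```python
import numpy as np


def calibrate(raw_score, scale, offset):
    """Affine calibration: map a raw score to an approximate LLR.

    `scale` and `offset` are hypothetical parameters that would normally be
    fitted on held-out data (for example with logistic regression).
    """
    return scale * np.asarray(raw_score, dtype=float) + offset


def linear_fusion(llr_asv, llr_cm, w_asv=0.5, w_cm=0.5):
    """Weighted sum of the calibrated ASV and CM LLRs."""
    return w_asv * np.asarray(llr_asv, dtype=float) + w_cm * np.asarray(llr_cm, dtype=float)


def nonlinear_fusion(llr_asv, llr_cm, spoof_prior=0.5):
    """Fused LLR against a mixture negative class (assumed model).

    Computes -log[(1 - rho) * exp(-llr_asv) + rho * exp(-llr_cm)],
    where rho = spoof_prior, using logaddexp for numerical stability.
    """
    llr_asv = np.asarray(llr_asv, dtype=float)
    llr_cm = np.asarray(llr_cm, dtype=float)
    term_nontarget = np.log1p(-spoof_prior) - llr_asv
    term_spoof = np.log(spoof_prior) - llr_cm
    return -np.logaddexp(term_nontarget, term_spoof)


# A trial that the ASV accepts strongly but the CM rejects strongly.
llr_asv, llr_cm = 10.0, -10.0
fused_linear = linear_fusion(llr_asv, llr_cm)
fused_nonlinear = nonlinear_fusion(llr_asv, llr_cm)
print(fused_linear, fused_nonlinear)
```

In this toy trial the equal-weight linear rule lets the two LLRs cancel, whereas the mixture-based rule stays strongly negative because a confident rejection from either subsystem dominates the fused score.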

Code & Models

Repositories

nii-yamagishilab/speechspc-mini
PyTorch · Official

Taxonomy

Topics: Speech Recognition and Synthesis · Advanced Chemical Sensor Technologies