A discriminative condition-aware backend for speaker verification
Luciana Ferrer, Mitchell McLaren

TL;DR
This paper introduces a discriminative, condition-aware backend for speaker verification. All of its parameters are trained jointly to minimize binary cross-entropy on the verification task, and the calibration stage is built into the model, so scores are well calibrated out of the box on most test sets without domain-specific calibration data.
Contribution
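The proposed backend combines joint discriminative training of all model parameters with an integrated calibration stage whose parameters depend on metadata vectors representing the signal conditions, which makes it suitable when the test conditions are not known in advance.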
Findings
Excellent out-of-the-box calibration performance on most of the authors' test sets
Effective in unknown test conditions
No need for domain-specific calibration data
Abstract
We present a scoring approach for speaker verification that mimics the standard PLDA-based backend process used in most current speaker verification systems. However, unlike the standard backends, all parameters of the model are jointly trained to optimize the binary cross-entropy for the speaker verification task. We further integrate the calibration stage inside the model, making the parameters of this stage depend on metadata vectors that represent the conditions of the signals. We show that the proposed backend has excellent out-of-the-box calibration performance on most of our test sets, making it an ideal approach for cases in which the test conditions are not known and development data is not available for training a domain-specific calibration model.
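The scoring pipeline outlined in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the quadratic PLDA-style score, the exponential link for the calibration scale, the summing of the two metadata vectors, and every name used (`raw_score`, `calibration_params`, `calibrated_llr`, `P`, `Q`, `w_alpha`, `w_beta`) are assumptions made for illustration only. The sketch shows a raw score being mapped to a log-likelihood ratio by a scale and offset that depend on the metadata of both signals, and the binary cross-entropy that joint training would minimize.

```python
import numpy as np


def raw_score(x1, x2, P, Q):
    """PLDA-style quadratic score for a pair of embeddings (assumed form)."""
    return 2.0 * (x1 @ P @ x2) + x1 @ Q @ x1 + x2 @ Q @ x2


def calibration_params(m1, m2, w_alpha, b_alpha, w_beta, b_beta):
    """Scale and offset as functions of the two metadata vectors.

    Hypothetical parameterization: the metadata vectors are summed so the
    result does not depend on the order of the two sides, and the scale is
    passed through exp() to keep it positive.
    """
    m = m1 + m2
    alpha = np.exp(w_alpha @ m + b_alpha)
    beta = w_beta @ m + b_beta
    return alpha, beta


def calibrated_llr(x1, x2, m1, m2, params):
    """Condition-dependent affine calibration applied to the raw score."""
    s = raw_score(x1, x2, params["P"], params["Q"])
    alpha, beta = calibration_params(
        m1, m2,
        params["w_alpha"], params["b_alpha"],
        params["w_beta"], params["b_beta"],
    )
    return alpha * s + beta


def binary_cross_entropy(llrs, labels, prior=0.5):
    """Mean binary cross-entropy; labels are 1 for target trials, 0 for impostor."""
    llrs = np.asarray(llrs, dtype=float)
    labels = np.asarray(labels, dtype=float)
    logits = llrs + np.log(prior / (1.0 - prior))
    # logaddexp(0, z) computes log(1 + exp(z)) in a numerically stable way.
    losses = labels * np.logaddexp(0.0, -logits) + (1.0 - labels) * np.logaddexp(0.0, logits)
    return losses.mean()
```

In a trainable version, `P`, `Q`, and the calibration weights would all be updated together by gradient descent on `binary_cross_entropy`, which is the sense in which the backend is trained jointly. With a target prior of 0.5, an uninformative system that outputs an LLR of 0 for every trial has a loss of log 2, which serves as a baseline for judging whether calibrated scores carry information.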
Taxonomy
Methods, Test
