# Incorporating Uncertainty from Speaker Embedding Estimation to Speaker Verification

**Authors:** Qiongqiong Wang, Kong Aik Lee, Tianchi Liu

arXiv: 2302.11763 · 2023-02-24

## TL;DR

This paper improves speaker verification by propagating the uncertainty of xi-vector speaker embeddings into PLDA scoring, reporting 14.5%-41.3% EER reductions and 4.6%-25.3% minDCF reductions on the VoxCeleb-1, SITW, and CNCeleb1-E test sets.

## Contribution

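The paper incorporates embedding uncertainty into PLDA scoring through three components: a posterior covariance matrix derived from the xi-vector network's frame-wise precisions, a log-likelihood ratio function that propagates this uncertainty, and a length scaling technique that replaces length normalization.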
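
As a rough illustration of how a posterior covariance can arise from frame-wise precisions, the sketch below fuses per-frame Gaussian estimates with a Gaussian prior and then pushes the result through one affine layer. It is a minimal sketch under stated assumptions, not the paper's derivation: it assumes diagonal precisions, and the function names `gaussian_posterior` and `propagate_linear` and all parameter names are hypothetical.

```python
import numpy as np


def gaussian_posterior(frame_means, frame_precisions, prior_mean, prior_precision):
    """Fuse frame-wise Gaussian estimates that have diagonal precisions.

    Assumption: each frame t contributes a point estimate and a diagonal
    precision, and the prior is Gaussian with a diagonal precision.

    frame_means, frame_precisions: arrays of shape (T, D)
    prior_mean, prior_precision:   arrays of shape (D,)
    Returns the posterior mean and the diagonal of the posterior covariance.
    """
    # Precisions of independent Gaussian observations add up.
    post_precision = prior_precision + frame_precisions.sum(axis=0)
    # The posterior mean is the precision-weighted average of all estimates.
    weighted_sum = prior_precision * prior_mean + (frame_precisions * frame_means).sum(axis=0)
    post_mean = weighted_sum / post_precision
    return post_mean, 1.0 / post_precision


def propagate_linear(mean, cov_diag, weight, bias):
    """Push a diagonal-covariance Gaussian through an affine map y = W x + b.

    Returns the output mean and the full output covariance W diag(c) W^T.
    """
    out_mean = weight @ mean + bias
    out_cov = (weight * cov_diag) @ weight.T
    return out_mean, out_cov
```

In this sketch, an utterance with more frames or higher frame-wise precisions yields a smaller posterior covariance, which is the sense in which the covariance measures uncertainty.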

## Key findings

- 14.5%-41.3% EER reductions across the VoxCeleb-1, SITW, and domain-mismatched CNCeleb1-E test sets
- Propagating embedding uncertainty into PLDA scoring improves verification accuracy
- 4.6%-25.3% minDCF reductions across the same test sets

## Abstract

Speech utterances recorded under differing conditions exhibit varying degrees of confidence in their embedding estimates, i.e., uncertainty, even if they are extracted using the same neural network. This paper aims to incorporate the uncertainty estimate produced in the xi-vector network front-end with a probabilistic linear discriminant analysis (PLDA) back-end scoring for speaker verification. To achieve this we derive a posterior covariance matrix, which measures the uncertainty, from the frame-wise precisions to the embedding space. We propose a log-likelihood ratio function for the PLDA scoring with the uncertainty propagation. We also propose to replace the length normalization pre-processing technique with a length scaling technique for the application of uncertainty propagation in the back-end. Experimental results on the VoxCeleb-1, SITW test sets as well as a domain-mismatched CNCeleb1-E set show the effectiveness of the proposed techniques with 14.5%-41.3% EER reductions and 4.6%-25.3% minDCF reductions.
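
To make the idea of uncertainty propagation in PLDA scoring concrete, the following sketch scores a trial with a two-covariance Gaussian model in which each utterance's covariance is added to the within-speaker covariance. This is a generic textbook-style formulation chosen for illustration, not the log-likelihood ratio function proposed in the paper. It assumes centered embeddings, and the names `plda_llr_with_uncertainty`, `between_cov`, `within_cov`, `unc1`, and `unc2` are hypothetical.

```python
import numpy as np


def log_gauss_zero_mean(x, cov):
    """Log-density of a zero-mean multivariate Gaussian at x."""
    dim = x.shape[0]
    _, logdet = np.linalg.slogdet(cov)
    quad = x @ np.linalg.solve(cov, x)
    return -0.5 * (dim * np.log(2.0 * np.pi) + logdet + quad)


def plda_llr_with_uncertainty(x1, x2, between_cov, within_cov, unc1, unc2):
    """Log-likelihood ratio for a same-speaker vs. different-speaker trial.

    Assumed model: x_i = y + e_i, with speaker variable y ~ N(0, between_cov)
    and residual e_i ~ N(0, within_cov + unc_i), where unc_i is the
    covariance of the i-th embedding estimate.
    """
    total1 = between_cov + within_cov + unc1
    total2 = between_cov + within_cov + unc2
    # Same speaker: the two embeddings share y, so they co-vary by between_cov.
    joint_cov = np.block([[total1, between_cov], [between_cov, total2]])
    log_same = log_gauss_zero_mean(np.concatenate([x1, x2]), joint_cov)
    # Different speakers: the two embeddings are independent.
    log_diff = log_gauss_zero_mean(x1, total1) + log_gauss_zero_mean(x2, total2)
    return log_same - log_diff
```

Under this assumed model, a larger `unc1` or `unc2` pulls the score toward zero, so an unreliable embedding contributes less evidence in either direction than a reliable one.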

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2302.11763/full.md

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/2302.11763/full.md

## References

33 references — full list in the complete paper: https://tomesphere.com/paper/2302.11763/full.md

---
Source: https://tomesphere.com/paper/2302.11763