A Generative Model for Score Normalization in Speaker Recognition
Albert Swart, Niko Brummer

TL;DR
This paper introduces a theoretical framework for score normalization in speaker recognition, explaining when normalization is and is not needed, and proposes a generative score-space model that gives improvements similar to ZT-norm.
Contribution
It presents a first attempt at using probability theory to design a generative model for score normalization, connecting theory to the ad-hoc normalization recipes used in practical speaker recognition systems.
Findings
The generative model gives improvements similar to ZT-norm on the text-dependent RSR 2015 database.
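Under (admittedly fragile) ideal conditions, score normalization is not needed.
When those conditions fail, e.g. under data-set shift between training and runtime, the theory reveals dependencies between scores that strategies such as score normalization can exploit.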
Abstract
We propose a theoretical framework for thinking about score normalization, which confirms that normalization is not needed under (admittedly fragile) ideal conditions. If, however, these conditions are not met, e.g. under data-set shift between training and runtime, our theory reveals dependencies between scores that could be exploited by strategies such as score normalization. Indeed, it has been demonstrated over and over experimentally, that various ad-hoc score normalization recipes do work. We present a first attempt at using probability theory to design a generative score-space normalization model which gives similar improvements to ZT-norm on the text-dependent RSR 2015 database.
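For readers unfamiliar with the baseline named in the abstract, the following is a minimal sketch of the conventional ZT-norm recipe, not the generative model proposed in the paper. It assumes the usual formulation: Z-norm standardizes a trial score using the enrollment model's scores against a cohort of impostor utterances, and T-norm then standardizes the result using the test utterance's (Z-normalized) scores against a cohort of impostor models. All function and parameter names are hypothetical, chosen for illustration only.

```python
from statistics import mean, pstdev


def z_norm(score, model_vs_z_cohort):
    """Standardize a score with the mean and standard deviation of the
    same model's scores against a cohort of impostor utterances."""
    return (score - mean(model_vs_z_cohort)) / pstdev(model_vs_z_cohort)


def zt_norm(score, enroll_vs_z_cohort, t_models_vs_test, t_models_vs_z_cohort):
    """Apply Z-norm, then T-norm on the Z-normalized scores.

    score                 raw score of the enrollment model against the test utterance
    enroll_vs_z_cohort    scores of the enrollment model against the Z cohort utterances
    t_models_vs_test      scores of each T cohort model against the test utterance
    t_models_vs_z_cohort  for each T cohort model, its scores against the Z cohort
    """
    # Step 1: Z-normalize the trial score.
    s_z = z_norm(score, enroll_vs_z_cohort)
    # Step 2: Z-normalize each T cohort score with that cohort model's own Z statistics.
    cohort_z = [
        z_norm(s, z_scores)
        for s, z_scores in zip(t_models_vs_test, t_models_vs_z_cohort)
    ]
    # Step 3: T-normalize using the statistics of the Z-normalized cohort scores.
    return (s_z - mean(cohort_z)) / pstdev(cohort_z)


# Toy usage with made-up numbers (not data from the paper).
example = zt_norm(
    score=3.0,
    enroll_vs_z_cohort=[0.0, 2.0],
    t_models_vs_test=[1.0, 3.0],
    t_models_vs_z_cohort=[[0.0, 2.0], [0.0, 2.0]],
)
```

The sketch uses the population standard deviation and omits safeguards such as a variance floor; real systems typically use much larger cohorts and may choose differently.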
