Likelihood-ratio calibration using prior-weighted proper scoring rules
Niko Brümmer, George Doddington

TL;DR
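This paper generalizes prior-weighted logistic regression calibration to a parametric family of proper scoring rules, analyzes how the family's parameters interact with the prior weighting, and reports NIST SRE'12 experiments suggesting that rules emphasizing higher score thresholds may be more accurate for speaker recognition applications with low false-alarm rate requirements.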
Contribution
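It generalizes prior-weighted logistic regression, which optimizes the expected value of the logarithmic scoring rule, to a parametric family of proper scoring rules, and pairs a theoretical analysis of the weighting each member induces over decision thresholds with experiments on NIST SRE'12.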
Findings
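For applications with low false-alarm rate requirements, scoring rules tailored to emphasize higher score thresholds may give better accuracy than logistic regression.
Different members of the family induce different relative weightings over applications whose decision thresholds range from low to high, and these weightings interact with the prior weighting.
Experiments on NIST SRE'12 support these observations.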
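The link between low false-alarm requirements and higher score thresholds can be illustrated with the standard Bayes decision threshold on a log-likelihood ratio (LLR). The sketch below is not taken from the paper: the function name `bayes_llr_threshold`, the equal default costs, and the example priors are all assumptions chosen for illustration.

```python
import math

def bayes_llr_threshold(p_target, c_miss=1.0, c_fa=1.0):
    """Minimum-expected-cost threshold on the LLR.

    Accept the target hypothesis when LLR exceeds
    log((1 - p_target) * c_fa / (p_target * c_miss)).
    The cost values are illustrative assumptions.
    """
    return math.log((1.0 - p_target) * c_fa / (p_target * c_miss))

# Hypothetical target priors: a smaller prior (or a larger false-alarm cost)
# pushes the threshold upward, which lowers the false-alarm rate.
thresholds = {p: bayes_llr_threshold(p) for p in (0.5, 0.1, 0.01, 0.001)}

for p, t in thresholds.items():
    print(f"P(target) = {p:<6} -> accept if LLR > {t:.2f}")
```

Under these assumptions, an application that tolerates few false alarms operates at a high LLR threshold, which is the region the findings refer to.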
Abstract
Prior-weighted logistic regression has become a standard tool for calibration in speaker recognition. Logistic regression is the optimization of the expected value of the logarithmic scoring rule. We generalize this via a parametric family of proper scoring rules. Our theoretical analysis shows how different members of this family induce different relative weightings over a spectrum of applications whose decision thresholds range from low to high. Special attention is given to the interaction between prior weighting and proper scoring rule parameters. Experiments on NIST SRE'12 suggest that for applications with low false-alarm rate requirements, scoring rules tailored to emphasize higher score thresholds may give better accuracy than logistic regression.
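As a point of reference for the baseline that the abstract generalizes, the following is a minimal sketch of prior-weighted logistic regression for an affine score-to-LLR calibration. It covers only the logarithmic scoring rule, not the paper's parametric family. The affine form `a * score + b`, the synthetic Gaussian scores, the prior of 0.01, and the plain gradient-descent settings are assumptions made for illustration, not details from the paper.

```python
import numpy as np

def sigmoid(x):
    # Numerically stable logistic function.
    return 0.5 * (1.0 + np.tanh(0.5 * x))

def objective(params, tar, non, prior):
    """Prior-weighted expected logarithmic cost of llr = a * score + b."""
    a, b = params
    offset = np.log(prior / (1.0 - prior))  # prior log-odds
    z_tar = a * tar + b + offset            # posterior log-odds, target trials
    z_non = a * non + b + offset            # posterior log-odds, non-target trials
    # np.logaddexp(0, x) computes log(1 + exp(x)) without overflow.
    return (prior * np.mean(np.logaddexp(0.0, -z_tar))
            + (1.0 - prior) * np.mean(np.logaddexp(0.0, z_non)))

def gradient(params, tar, non, prior):
    a, b = params
    offset = np.log(prior / (1.0 - prior))
    g_tar = -prior * sigmoid(-(a * tar + b + offset)) / tar.size
    g_non = (1.0 - prior) * sigmoid(a * non + b + offset) / non.size
    return np.array([g_tar @ tar + g_non @ non, g_tar.sum() + g_non.sum()])

def fit(tar, non, prior, steps=2000, lr=0.5):
    # Plain gradient descent; the step size is an assumption suited to
    # scores of roughly unit scale, as in the synthetic data below.
    params = np.array([1.0, 0.0])
    for _ in range(steps):
        params = params - lr * gradient(params, tar, non, prior)
    return params

# Synthetic scores (assumption): targets ~ N(2, 1), non-targets ~ N(-2, 1).
rng = np.random.default_rng(0)
tar = rng.normal(2.0, 1.0, 2000)
non = rng.normal(-2.0, 1.0, 2000)
prior = 0.01  # hypothetical training prior

init = np.array([1.0, 0.0])
a, b = fit(tar, non, prior)
print("scale:", a, "offset:", b)
print("cost before:", objective(init, tar, non, prior))
print("cost after: ", objective(np.array([a, b]), tar, non, prior))
```

Each class is averaged separately and then weighted by `prior` and `1 - prior`, so the training prior, rather than the class proportions in the data, determines how much target and non-target errors count.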
Taxonomy
Methods: Logistic Regression
