The BOSARIS Toolkit: Theory, Algorithms and Code for Surviving the New DCF
Niko Brümmer, Edward de Villiers

TL;DR
The BOSARIS Toolkit provides advanced algorithms and tools for calibration, evaluation, and fusion of speaker recognition systems under the challenging new DCF criterion, enabling effective handling of large datasets.
Contribution
It introduces novel algorithms and tools for calibration, evaluation, and fusion tailored to the new DCF, including visualization, efficient computation, and large-scale data handling.
Findings
Normalized Bayes Error-Rate Plot for calibration analysis
Efficient algorithms for large score files and DCF computation
Enhanced logistic regression optimizer for system fusion
Abstract
The change of two orders of magnitude in the 'new DCF' of NIST's SRE'10, relative to the 'old DCF' evaluation criterion, posed a difficult challenge for participants and evaluator alike. Initially, participants were at a loss as to how to calibrate their systems, while the evaluator underestimated the required number of evaluation trials. After the fact, it is now obvious that both calibration and evaluation require very large sets of trials. This poses the challenges of (i) how to decide what number of trials is enough, and (ii) how to process such large data sets with reasonable memory and CPU requirements. After SRE'10, at the BOSARIS Workshop, we built solutions to these problems into the freely available BOSARIS Toolkit. This paper explains the principles and algorithms behind this toolkit. The main contributions of the toolkit are: 1. The Normalized Bayes Error-Rate Plot, which…
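To make the scale of the change concrete, the following is a minimal sketch, not the toolkit's own code. It assumes the commonly cited NIST operating points, which this summary does not state: old DCF with P_target = 0.01, C_miss = 10, C_fa = 1, and new DCF with P_target = 0.001, C_miss = 1, C_fa = 1. The function names `effective_prior` and `min_norm_dcf` are hypothetical. The sketch folds the costs into a single effective prior and computes a normalized minimum DCF with one sort, so that memory and time grow as O(n) and O(n log n) in the number of trials.

```python
import numpy as np

def effective_prior(p_target, c_miss, c_fa):
    """Fold the two costs into the prior (hypothetical helper name)."""
    return p_target * c_miss / (p_target * c_miss + (1.0 - p_target) * c_fa)

def min_norm_dcf(target_scores, nontarget_scores, p_target, c_miss=1.0, c_fa=1.0):
    """Normalized minimum DCF over all thresholds, via a single sort.

    A sketch under stated assumptions; higher scores favor the target hypothesis.
    """
    tar = np.asarray(target_scores, dtype=float)
    non = np.asarray(nontarget_scores, dtype=float)
    scores = np.concatenate([tar, non])
    is_target = np.concatenate([np.ones(tar.size), np.zeros(non.size)])

    order = np.argsort(scores, kind="mergesort")
    s = scores[order]
    is_target = is_target[order]

    # Threshold index k rejects sorted trials [0, k) and accepts [k, n).
    # Misses are rejected targets; false alarms are accepted non-targets.
    p_miss = np.concatenate([[0.0], np.cumsum(is_target)]) / tar.size
    p_fa = 1.0 - np.concatenate([[0.0], np.cumsum(1.0 - is_target)]) / non.size

    # A threshold cannot separate tied scores, so keep only real boundaries.
    valid = np.concatenate([[True], s[1:] > s[:-1], [True]])

    dcf = c_miss * p_target * p_miss + c_fa * (1.0 - p_target) * p_fa
    # Normalize by the cost of the best decision that ignores the score.
    default_cost = min(c_miss * p_target, c_fa * (1.0 - p_target))
    return dcf[valid].min() / default_cost

# Assumed operating points (not given in the text above).
old_prior = effective_prior(0.01, 10.0, 1.0)   # roughly 0.092
new_prior = effective_prior(0.001, 1.0, 1.0)   # 0.001
```

Under these assumed parameters the effective prior drops by a factor of about 92, which matches the "two orders of magnitude" in the abstract. At so small a prior the cost is dominated by false alarms at very low rates, which is one way to see why many more trials are needed to estimate it reliably.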
Taxonomy
Topics: Advanced Statistical Methods and Models · Anomaly Detection Techniques and Applications · Imbalanced Data Classification Techniques
