An Entropic Metric for Measuring Calibration of Machine Learning Models
Daniel James Sumler, Lee Devlin, Simon Maskell, Richard O. Lane

TL;DR
This paper introduces the Entropic Calibration Difference (ECD), a new metric for assessing the confidence calibration of machine learning models. Unlike metrics that report only the size of a miscalibration, ECD distinguishes under-confidence from over-confidence, and the authors demonstrate it on real and simulated data.
Contribution
The paper proposes a novel calibration metric, ECD, inspired by target tracking, that uniquely differentiates under- from over-confidence in machine learning models.
Findings
ECD effectively distinguishes under- and over-confidence.
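ECD compares favorably with the Expected Calibration Error (ECE) and the Expected Signed Calibration Error (ESCE) on various datasets.
The metric provides insight into the trade-off between model safety and statistical efficiency: under-confident models are likely to be safer than over-confident ones, but over-cautious and so statistically inefficient.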
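To make the under- versus over-confidence distinction concrete, here is a minimal sketch of ECE, the baseline named in the findings above, alongside a signed version of the same per-bin gap. It is not the paper's ECD, whose definition this page does not give, and the signed gap is not necessarily the ESCE definition the paper uses. The function name `calibration_gaps`, the equal-width binning, and the toy data are all assumptions made for illustration.

```python
def calibration_gaps(probs, labels, n_bins=10):
    """Return (ece, signed_gap) for binary predictions.

    probs  : predicted probability of class 1 for each datum
    labels : true labels, each 0 or 1
    Hypothetical helper for illustration; not taken from the paper.
    """
    n = len(probs)
    # Per-bin accumulators: count, summed confidence, summed correctness.
    counts = [0] * n_bins
    conf_sum = [0.0] * n_bins
    hit_sum = [0.0] * n_bins
    for p, y in zip(probs, labels):
        pred = 1 if p >= 0.5 else 0
        conf = max(p, 1.0 - p)  # confidence in the predicted class
        b = min(int(conf * n_bins), n_bins - 1)  # equal-width bin index
        counts[b] += 1
        conf_sum[b] += conf
        hit_sum[b] += 1.0 if pred == y else 0.0

    ece = 0.0
    signed_gap = 0.0
    for c, cs, hs in zip(counts, conf_sum, hit_sum):
        if c == 0:
            continue  # empty bins contribute nothing
        gap = cs / c - hs / c  # mean confidence minus accuracy in this bin
        ece += (c / n) * abs(gap)   # unsigned: magnitude only
        signed_gap += (c / n) * gap  # signed: positive means over-confident
    return ece, signed_gap


# Toy data (assumed): one over-confident and one under-confident model.
over = calibration_gaps([0.9] * 10, [1] * 6 + [0] * 4)   # says 0.9, right 60%
under = calibration_gaps([0.6] * 10, [1] * 9 + [0] * 1)  # says 0.6, right 90%
```

On this toy data the unsigned ECE is about 0.3 for both models, so it cannot tell them apart, while the signed gap is positive for the over-confident model and negative for the under-confident one. That sign is the kind of information a metric needs to preserve if under-confidence is to be treated as the safer failure mode.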
Abstract
Understanding the confidence with which a machine learning model classifies an input datum is an important, and perhaps under-investigated, concept. In this paper, we propose a new calibration metric, the Entropic Calibration Difference (ECD). Based on existing research in the field of state estimation, specifically target tracking (TT), we show how ECD may be applied to binary classification machine learning models. We describe the relative importance of under- and over-confidence and how they are not conflated in the TT literature. Indeed, our metric distinguishes under- from over-confidence. We consider this important given that algorithms that are under-confident are likely to be 'safer' than algorithms that are over-confident, albeit at the expense of also being over-cautious and so statistically inefficient. We demonstrate how this new metric performs on real and simulated data…
Taxonomy
Topics: Neural Networks and Applications
