Decoupling of neural network calibration measures
Dominik Werner Wolf, Prasannavenkatesh Balaji, Alexander Braun, and Markus Ulrich

TL;DR
This paper examines inconsistencies among neural network calibration measures such as ECE and AUSE, highlights their limitations for safety-critical applications, and proposes AUSE as an indirect measure of residual uncertainty.
Contribution
It shows that common calibration metrics (ECE, AUSE, UCS, UCE) do not agree on a unique optimal calibration, and it proposes AUSE as an indirect measure of residual uncertainty in neural networks.
Findings
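Calibration measures such as ECE, AUSE, UCS, and UCE give inconsistent answers about which calibration is optimal.
Current methodologies leave a degree of freedom that prevents a unique model calibration, which hinders the homologation of safety-critical functionalities.
AUSE can serve as an indirect measure of the residual uncertainty, which is irreducible for a fixed network architecture.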
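For readers unfamiliar with AUSE, the following is a minimal sketch of one common formulation of the sparsification error curve, not necessarily the exact definition used in the paper. The function name `ause`, the `n_steps` parameter, and the unnormalized trapezoidal area are assumptions made for illustration.

```python
import numpy as np

def ause(errors, uncertainties, n_steps=20):
    """Area under the sparsification error curve (illustrative sketch).

    errors:        per-sample prediction errors (larger means worse)
    uncertainties: per-sample predicted uncertainties
    n_steps:       number of removal fractions sampled in [0, 1) (assumed)
    """
    errors = np.asarray(errors, dtype=float)
    uncertainties = np.asarray(uncertainties, dtype=float)
    n = errors.size

    # Errors ordered by decreasing predicted uncertainty (sparsification)
    by_uncertainty = errors[np.argsort(-uncertainties, kind="stable")]
    # Errors ordered by decreasing true error (oracle ranking)
    by_error = np.sort(errors)[::-1]

    fractions = np.linspace(0.0, 1.0, n_steps, endpoint=False)
    gap = np.empty(n_steps)
    for i, f in enumerate(fractions):
        k = int(f * n)  # number of samples removed at this fraction
        # Mean remaining error: model ranking minus oracle ranking
        gap[i] = by_uncertainty[k:].mean() - by_error[k:].mean()

    # Trapezoidal integration of the sparsification error over the fractions
    return float(np.sum(0.5 * (gap[1:] + gap[:-1]) * np.diff(fractions)))
```

In this sketch the value is zero when the predicted uncertainties rank the samples exactly as the true errors do, and it grows as the two rankings diverge.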
Abstract
A lot of effort is currently invested in safeguarding autonomous driving systems, which heavily rely on deep neural networks for computer vision. We investigate the coupling of different neural network calibration measures with a special focus on the Area Under the Sparsification Error curve (AUSE) metric. We elaborate on the well-known inconsistency in determining optimal calibration using the Expected Calibration Error (ECE) and we demonstrate similar issues for the AUSE, the Uncertainty Calibration Score (UCS), as well as the Uncertainty Calibration Error (UCE). We conclude that the current methodologies leave a degree of freedom, which prevents a unique model calibration for the homologation of safety-critical functionalities. Furthermore, we propose the AUSE as an indirect measure for the residual uncertainty, which is irreducible for a fixed network architecture and is driven by…
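As a point of reference for the ECE mentioned in the abstract, here is a minimal sketch of the standard equal-width binned estimator. It is not taken from the paper; the function name `expected_calibration_error` and the default `n_bins=10` are assumptions made for illustration.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE estimate (illustrative sketch).

    confidences: predicted confidence per sample, in [0, 1]
    correct:     1 if the prediction was correct, else 0
    n_bins:      number of equal-width confidence bins (a free choice)
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)

    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Assign each sample to a right-closed bin (lo, hi]; indices run 0..n_bins-1
    bin_ids = np.digitize(confidences, edges[1:-1], right=True)

    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            # Gap between accuracy and mean confidence inside the bin
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            # Weight the gap by the fraction of samples in the bin
            ece += mask.mean() * gap
    return float(ece)
```

Because `n_bins` is chosen by the user, the estimate can change with the binning, which is one commonly cited reason why ECE alone may not pin down a unique calibration.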
Taxonomy
Topics: Sensor Technology and Measurement Systems
Methods: Focus
