Evaluating and Boosting Uncertainty Quantification in Classification
Xiaoyang Huang, Jiancheng Yang, Linguo Li, Haoran Deng, Bingbing Ni, and Yi Xu

TL;DR
This paper introduces a new evaluation metric, AUCCC, for uncertainty quantification (UQ) in classification, and proposes a distillation method, UDist, to enhance UQ performance. Both are validated on natural and medical images.
Contribution
It presents a unified, robust metric for UQ evaluation and a simple distillation scheme to improve UQ accuracy in classification tasks.
Findings
AUCCC effectively evaluates UQ models.
UDist consistently improves UQ performance.
The method outperforms baselines on diverse datasets, covering both natural and medical images.
Abstract
The emergence of artificial intelligence techniques in biomedical applications urges researchers to pay more attention to uncertainty quantification (UQ) in machine-assisted medical decision making. For classification tasks, prior studies on UQ are difficult to compare with each other because there is no unified quantitative evaluation metric. Considering that well-performing UQ models ought to know when the classification models act incorrectly, we design a new evaluation metric, the area under Confidence-Classification Characteristic curves (AUCCC), to quantitatively evaluate the performance of UQ models. AUCCC is threshold-free, robust to perturbation, and insensitive to the classification performance. We evaluate several UQ methods (e.g., max softmax output) with AUCCC to validate its effectiveness. Furthermore, a simple scheme, named Uncertainty Distillation (UDist), is…
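The abstract names max softmax output as one of the UQ baselines and says a good UQ model should know when the classifier is wrong. The sketch below illustrates that idea under stated assumptions. It computes the max softmax confidence, then scores how well that confidence separates correct from incorrect predictions using a standard AUROC. This AUROC is a stand-in chosen for illustration and is not the paper's AUCCC, whose curve construction is not given in this summary. The function names `max_softmax_confidence` and `error_detection_auroc` are hypothetical.

```python
import numpy as np


def max_softmax_confidence(logits):
    """Return (predicted class, max softmax probability) for each row of logits."""
    logits = np.asarray(logits, dtype=float)
    # Subtract the row-wise max for numerical stability before exponentiating.
    shifted = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(shifted)
    probs /= probs.sum(axis=1, keepdims=True)
    return probs.argmax(axis=1), probs.max(axis=1)


def error_detection_auroc(confidence, correct):
    """AUROC of confidence as a detector of correct predictions.

    Illustrative stand-in, NOT the paper's AUCCC. It equals the probability
    that a randomly chosen correct prediction receives higher confidence than
    a randomly chosen incorrect one, with ties counted as one half.
    """
    confidence = np.asarray(confidence, dtype=float)
    correct = np.asarray(correct, dtype=bool)
    pos = confidence[correct]
    neg = confidence[~correct]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("need at least one correct and one incorrect prediction")
    # Compare every correct/incorrect pair of confidence scores.
    diff = pos[:, None] - neg[None, :]
    return float((diff > 0).mean() + 0.5 * (diff == 0).mean())


if __name__ == "__main__":
    # Toy logits and labels, made up for illustration only.
    logits = [[4.0, 0.0, 0.0], [0.0, 3.0, 0.0], [1.0, 0.9, 0.0], [0.0, 0.2, 0.1]]
    labels = np.array([0, 1, 1, 2])
    preds, conf = max_softmax_confidence(logits)
    print(error_detection_auroc(conf, preds == labels))
```

Like the metric described in the abstract, this score needs no confidence threshold, because it depends only on how the confidence values rank correct predictions against incorrect ones.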
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Time Series Analysis and Forecasting · Imbalanced Data Classification Techniques
Methods: Softmax
