Rethinking Vacuity for OOD Detection in Evidential Deep Learning
Claire McNamara

TL;DR
This paper reveals that vacuity-based OOD detection in Evidential Deep Learning is highly sensitive to class cardinality differences, which can lead to misleading evaluation results.
Contribution
It provides empirical and theoretical evidence that class cardinality differences significantly impact OOD detection metrics in EDL, highlighting an evaluation artefact.
Findings
AUROC and AUPR can differ substantially when ID and OOD class counts differ.
Evaluation metrics can be artificially inflated due to class cardinality differences.
The paper discusses implications for evaluating EDL in language models using MCQA datasets.
Abstract
Vacuity, or Uncertainty Mass (UM), is commonly used as a metric to evaluate Out-of-Distribution (OOD) detection in Evidential Deep Learning (EDL). It generally involves dividing the number of classes ($K$) by the total strength of belief ($S$) of the model's predictions, where $S$ is derived from summing the Dirichlet parameters. As such, UM is sensitive to the cardinality of $K$. In particular, it is unlikely in practice that there is a linear relationship between $K$ and $S$ as $K$ and $S$ increase due to the nature of EDL (suppressing incorrectly assigned evidence). As a result, when comparing In Distribution (ID) and OOD results, it is important that $K_{\text{ID}}$ and $K_{\text{OOD}}$ are equal; something that is not always ensured in practice. We provide an empirical demonstration of how results for AUROC and AUPR can substantially differ when class cardinality between ID and…
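The cardinality effect described in the abstract can be illustrated with a small synthetic sketch. Nothing here comes from the paper's experiments: the helper `synthetic_alpha`, the gamma-distributed evidence, and the class counts 4 and 10 are assumptions chosen for illustration only. The sketch assumes a model that places evidence on a single class (so the total strength grows sub-linearly with the number of classes) and draws that evidence from the same distribution for ID and OOD inputs, meaning the model has no genuine ability to separate them.

```python
import numpy as np

def vacuity(alpha):
    """Uncertainty mass u = K / S, where S is the sum of the Dirichlet parameters."""
    alpha = np.asarray(alpha, dtype=float)
    num_classes = alpha.shape[-1]
    strength = alpha.sum(axis=-1)
    return num_classes / strength

def auroc(scores_id, scores_ood):
    """Probability that an OOD score exceeds an ID score (ties count as 0.5)."""
    ood = np.asarray(scores_ood)[:, None]
    ind = np.asarray(scores_id)[None, :]
    return float((ood > ind).mean() + 0.5 * (ood == ind).mean())

def synthetic_alpha(rng, n, num_classes):
    """Hypothetical EDL output: evidence on one class only, alpha = evidence + 1."""
    evidence = np.zeros((n, num_classes))
    top = rng.integers(0, num_classes, size=n)
    # Assumed evidence distribution; identical for every dataset below.
    evidence[np.arange(n), top] = rng.gamma(shape=2.0, scale=5.0, size=n)
    return evidence + 1.0

rng = np.random.default_rng(0)
n = 2000

u_id = vacuity(synthetic_alpha(rng, n, num_classes=4))
u_ood_same_k = vacuity(synthetic_alpha(rng, n, num_classes=4))
u_ood_more_k = vacuity(synthetic_alpha(rng, n, num_classes=10))

auroc_same_k = auroc(u_id, u_ood_same_k)
auroc_more_k = auroc(u_id, u_ood_more_k)

print(f"AUROC, K_ID = K_OOD = 4:      {auroc_same_k:.3f}")
print(f"AUROC, K_ID = 4, K_OOD = 10:  {auroc_more_k:.3f}")
```

Under these assumptions the matched-cardinality AUROC should sit near chance, while the mismatched case should score well above it even though the evidence distribution is unchanged. The gap comes entirely from the numerator of the vacuity ratio, which is the kind of evaluation artefact the abstract warns about.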
