$ECUAS_n$: A family of metrics for principled evaluation of uncertainty-augmented systems

Lautaro Estienne; Erik Ernst; Matías Vera; Pablo Piantanida; Luciana Ferrer

arXiv:2605.20490 · cs.AI · May 22, 2026

TL;DR

This paper introduces a new family of metrics, $ECUAS_n$, designed to evaluate uncertainty-augmented systems holistically for decision-making tasks, addressing limitations of existing evaluation methods.

Contribution

The authors propose the $ECUAS_n$ metrics as a principled, flexible evaluation framework for uncertainty-augmented systems, with theoretical justification and empirical validation.

Findings

01

$ECUAS_n$ metrics effectively balance prediction accuracy and uncertainty quality.

02

The metrics outperform existing evaluation methods across diverse datasets.

03

Experiments on TriviaQA demonstrate practical applicability.

Abstract

In high-stakes automated decision-making, access to predictive uncertainty is essential for enabling users -- human or downstream systems -- to accept or reject predictions based on application-specific cost trade-offs. Such uncertainty-augmented (UA) systems -- i.e., systems that output both predictions and uncertainty scores -- are currently being assessed in the literature in a variety of ways, using separate metrics to evaluate the predictions and the uncertainty scores, setting a cost function with a fixed rejection cost or integrating over a coverage-risk curve. We argue that these evaluation approaches are inadequate for assessing overall performance of the UA system for decision making under uncertainty and propose a novel family of metrics, $ECUAS_n$, formulated as proper scoring rules for the task of interest. The parameter $n$ controls the trade-off between the cost of…
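For readers unfamiliar with the two existing evaluation styles the abstract names, a fixed rejection cost and a coverage-risk curve, the following is a minimal sketch of each. It does not implement $ECUAS_n$, whose definition is not given in the text above. The function names, the 0/1 error cost, and the convention that a higher score means more uncertainty are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def fixed_rejection_cost(correct: Sequence[bool], uncertainty: Sequence[float],
                         threshold: float, reject_cost: float) -> float:
    """Average cost when predictions with uncertainty above `threshold` are rejected.

    Assumed cost model: accepted and wrong costs 1, accepted and right costs 0,
    rejected costs `reject_cost` regardless of correctness.
    """
    total = 0.0
    for is_correct, u in zip(correct, uncertainty):
        if u > threshold:
            total += reject_cost      # prediction rejected
        elif not is_correct:
            total += 1.0              # prediction accepted but wrong
    return total / len(correct)


def risk_coverage_curve(correct: Sequence[bool],
                        uncertainty: Sequence[float]) -> List[Tuple[float, float]]:
    """(coverage, risk) points from accepting the k least uncertain samples, k = 1..N."""
    order = sorted(range(len(correct)), key=lambda i: uncertainty[i])
    points = []
    errors = 0
    for k, i in enumerate(order, start=1):
        errors += 0 if correct[i] else 1
        points.append((k / len(correct), errors / k))
    return points


def area_under_risk_coverage(points: Sequence[Tuple[float, float]]) -> float:
    """Mean risk over the sampled coverage levels (a simple Riemann-sum approximation)."""
    return sum(risk for _, risk in points) / len(points)


# Hypothetical toy data: five predictions with their uncertainty scores.
correct = [True, True, False, True, False]
uncertainty = [0.1, 0.2, 0.9, 0.3, 0.6]

cost = fixed_rejection_cost(correct, uncertainty, threshold=0.5, reject_cost=0.3)
curve = risk_coverage_curve(correct, uncertainty)
area = area_under_risk_coverage(curve)
```

In this sketch the first function commits to a single threshold and a single rejection cost, while the second averages over every possible coverage level. Both collapse the application-specific cost trade-off into one fixed choice, which is the kind of limitation the abstract says motivates a parameterized family of metrics.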
