# Failure Prediction Is a Better Performance Proxy for Early-Exit Networks Than Calibration

**Authors:** Piotr Kubaty, Filip Szatkowski, Metod Jazbec, Bartosz Wójcik

arXiv: 2508.21495 · 2025-09-09

## TL;DR

This paper argues that failure prediction is a more reliable proxy than calibration for early-exit network performance: unlike calibration metrics, it captures changes in sample rankings and correlates strongly with efficiency gains.

## Contribution

The authors demonstrate that calibration metrics can be misleading and propose failure prediction as a superior proxy for early-exit model performance evaluation.

## Key findings

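- Miscalibrated networks can outperform calibrated ones, according to the paper's empirical evidence.
- Failure prediction captures changes in sample rankings and correlates strongly with efficiency gains.
- Calibration metrics can be misleading indicators of multi-exit model performance.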
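
The contrast between the two kinds of metric can be illustrated with a small synthetic sketch. It is not the paper's experimental setup: the data is simulated, the function names are hypothetical, and expected calibration error (ECE) and AUROC are assumed here as common representatives of calibration and failure prediction metrics, since this summary does not state which metrics the authors use. The sketch applies a strictly increasing distortion to roughly calibrated confidences, which should worsen ECE while leaving the ranking of samples, and therefore the AUROC, unchanged.

```python
import random


def expected_calibration_error(confidences, correct, n_bins=10):
    """Equal-width-bin ECE: sample-weighted mean of |accuracy - confidence| per bin."""
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for c, y in zip(confidences, correct):
        idx = min(int(c * n_bins), n_bins - 1)  # confidence 1.0 goes to the last bin
        bins[idx].append((c, y))
    ece = 0.0
    for members in bins:
        if members:
            avg_conf = sum(c for c, _ in members) / len(members)
            accuracy = sum(y for _, y in members) / len(members)
            ece += len(members) / n * abs(accuracy - avg_conf)
    return ece


def failure_prediction_auroc(confidences, correct):
    """Chance that a random correct sample outranks a random incorrect one (ties count 0.5)."""
    pos = [c for c, y in zip(confidences, correct) if y]
    neg = [c for c, y in zip(confidences, correct) if not y]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Synthetic, roughly calibrated classifier (assumption): P(correct) equals the confidence.
rng = random.Random(0)
conf = [0.5 + 0.5 * rng.random() for _ in range(2000)]
correct = [1 if rng.random() < c else 0 for c in conf]

# Strictly increasing distortion: changes confidence values but not their order.
distorted = [c ** 3 for c in conf]

ece_original = expected_calibration_error(conf, correct)
ece_distorted = expected_calibration_error(distorted, correct)
auroc_original = failure_prediction_auroc(conf, correct)
auroc_distorted = failure_prediction_auroc(distorted, correct)

print(f"ECE:   {ece_original:.3f} -> {ece_distorted:.3f}")
print(f"AUROC: {auroc_original:.3f} -> {auroc_distorted:.3f}")
```

Because a threshold on confidence only depends on how samples are ordered, the two confidence sets in this toy example select the same samples at any given exit rate, even though their calibration differs.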

## Abstract

Early-exit models accelerate inference by attaching internal classifiers to intermediate layers of the network, allowing computation to halt once a prediction meets a predefined exit criterion. Most early-exit methods rely on confidence-based exit strategies, which has motivated prior work to calibrate intermediate classifiers in pursuit of improved performance-efficiency trade-offs. In this paper, we argue that calibration metrics can be misleading indicators of multi-exit model performance. Specifically, we present empirical evidence showing that miscalibrated networks can outperform calibrated ones. As an alternative, we propose using failure prediction as a more informative proxy for early-exit model performance. Unlike calibration, failure prediction captures changes in sample rankings and correlates strongly with efficiency gains, offering a more reliable framework for designing and evaluating early-exit models.
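
The confidence-based exit rule described in the abstract can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes the confidence score is the maximum softmax probability and that the final classifier always returns a prediction, and the names `early_exit_predict`, `exit_logits`, and `threshold` are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def early_exit_predict(exit_logits, threshold=0.9):
    """Return (predicted_class, exit_index) for one sample.

    exit_logits is an iterable of per-exit logit lists, ordered from the
    shallowest internal classifier to the final one. Iteration stops at the
    first exit whose confidence reaches the threshold; if none does, the
    prediction of the last exit is returned.
    """
    prediction, exit_index = None, -1
    for exit_index, logits in enumerate(exit_logits):
        probs = softmax(logits)
        confidence = max(probs)  # assumption: max softmax probability as confidence
        prediction = probs.index(confidence)
        if confidence >= threshold:
            break  # halt: deeper exits are never evaluated
    return prediction, exit_index


# Toy logits for three exits of a 3-class model (made-up values).
sample = [
    [1.0, 1.2, 0.9],  # shallow exit: uncertain
    [0.5, 3.0, 0.2],  # middle exit: fairly confident
    [0.1, 6.0, 0.0],  # final exit: very confident
]

print(early_exit_predict(sample, threshold=0.8))    # exits at the middle classifier
print(early_exit_predict(sample, threshold=0.999))  # falls through to the final classifier
```

Passing a generator as `exit_logits` would mimic real inference, where deeper layers are only computed if the sample has not yet exited. Raising `threshold` trades compute for later, typically more accurate, exits.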

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2508.21495/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/2508.21495/full.md

## References

58 references — full list in the complete paper: https://tomesphere.com/paper/2508.21495/full.md

---
Source: https://tomesphere.com/paper/2508.21495