LogitDynamics: Reliable ViT Error Detection from Layerwise Logit Trajectories

Ido Beigelman; Moti Freiman

arXiv:2604.10643 · cs.CV · April 14, 2026

TL;DR

This paper introduces LogitDynamics, a method that predicts Vision Transformer (ViT) classification errors from layerwise logit trajectories gathered in a single forward pass, improving confidence estimation at little additional computational cost.

Contribution

The paper presents a simple approach that attaches lightweight linear heads to intermediate ViT layers and models how class evidence evolves across depth, improving error prediction and cross-dataset generalization.

Findings

01 Improves or matches AUCPR over baselines across datasets

02 Shows stronger cross-dataset generalization

03 Requires minimal additional computation

Abstract

Reliable confidence estimation is critical when deploying vision models. We study error prediction: determining whether an image classifier's output is correct using only signals from a single forward pass. Motivated by internal-signal hallucination detection in large language models, we investigate whether similar depth-wise signals exist in Vision Transformers (ViTs). We propose a simple method that models how class evidence evolves across layers. By attaching lightweight linear heads to intermediate layers, we extract features from the last L layers that capture both the logits of the predicted class and its top-K competitors, as well as statistics describing instability of top-ranked classes across depth. A linear probe trained on these features predicts the error indicator. Across datasets, our method improves or matches AUCPR over baselines and shows stronger cross-dataset…
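
The feature extraction described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the per-layer logits from the lightweight heads are already available as an array of shape (L, C), and the names `trajectory_features` and `probe_score`, the feature layout, and the two instability statistics (number of top-1 flips across depth, fraction of layers agreeing with the final prediction) are hypothetical choices made here for illustration.

```python
import numpy as np


def trajectory_features(layer_logits, top_k=3):
    """Build a feature vector from logits of the last L layers.

    layer_logits: array of shape (L, C), one row per layer head,
    ordered from shallowest to deepest (final layer last).
    The layout below is an assumption, not the paper's exact recipe.
    """
    layer_logits = np.asarray(layer_logits, dtype=float)

    # Rank classes by the final-layer logit, highest first.
    order = np.argsort(-layer_logits[-1])
    pred = order[0]
    competitors = order[1:1 + top_k]

    # Logit trajectories of the predicted class and its top-K competitors.
    pred_traj = layer_logits[:, pred]            # shape (L,)
    comp_traj = layer_logits[:, competitors]     # shape (L, K)

    # Illustrative instability statistics of the top-ranked class across depth.
    top1_per_layer = layer_logits.argmax(axis=1)
    flips = np.count_nonzero(top1_per_layer[1:] != top1_per_layer[:-1])
    agreement = np.mean(top1_per_layer == pred)

    return np.concatenate([pred_traj, comp_traj.ravel(), [flips, agreement]])


def probe_score(features, weights, bias):
    """Linear probe: sigmoid of an affine function of the features.

    The weights would be fit (e.g. by logistic regression) against the
    error indicator; here they are simply passed in.
    """
    return 1.0 / (1.0 + np.exp(-(np.dot(weights, features) + bias)))


# Toy input: L = 3 layers, C = 4 classes, K = 2 competitors.
logits = [
    [2.0, 1.0, 0.0, -1.0],
    [1.0, 2.0, 0.5, -1.0],
    [0.5, 3.0, 1.0, -1.0],
]
feats = trajectory_features(logits, top_k=2)
score = probe_score(feats, np.zeros_like(feats), 0.0)
```

In this toy input the top-1 class changes once across the three layers, so the flip count is 1 and two of the three layers agree with the final prediction. The resulting vector has L + L·K + 2 entries; a real probe would be trained on such vectors with correct/incorrect labels from a held-out set.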
