Calibration Attention: Learning Reliability-Aware Representations for Vision Transformers

Wenhao Liang; Wei Emma Zhang; Lin Yue; Miao Xu; Mingyu Guo; Olaf Maennel; Weitong Chen

arXiv:2508.08547 · cs.CV · January 21, 2026

Open Access

TL;DR

Calibration Attention is a representation-aware calibration method for vision transformers. It improves uncertainty estimation by reshaping internal representations rather than only adjusting output logits, which yields better-calibrated and more trustworthy predictions.

Contribution

The paper proposes Calibration Attention (CalAttn), a novel representation-level calibration module that couples instance-wise temperature scaling with transformer token geometry for improved uncertainty estimation.

Findings

01 CalAttn significantly reduces calibration error across multiple datasets.

02 It maintains or improves the accuracy of vision transformers.

03 CalAttn adds negligible computational overhead.
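
The findings refer to calibration error without naming a metric on this page. A common way to measure it is the Expected Calibration Error (ECE), which bins predictions by confidence and averages the gap between accuracy and confidence in each bin. The sketch below assumes that metric; the function name, the 10-bin default, and the toy inputs are illustrative and are not taken from the paper.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: weighted mean of |accuracy - confidence| over equal-width bins.

    confidences: predicted probability of the chosen class, each in [0, 1].
    correct:     1 if the prediction was right, 0 otherwise.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        return 0.0

    # Per bin: [sample count, sum of confidences, number of correct predictions]
    bins = [[0, 0.0, 0] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        # A confidence of exactly 1.0 falls into the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx][0] += 1
        bins[idx][1] += conf
        bins[idx][2] += hit

    ece = 0.0
    for count, conf_sum, hits in bins:
        if count:
            accuracy = hits / count
            mean_conf = conf_sum / count
            ece += (count / n) * abs(accuracy - mean_conf)
    return ece


# Toy example (made-up numbers): four predictions at 90% confidence, three correct.
overconfident = expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 1, 1, 0])
print(f"ECE of the toy batch: {overconfident:.3f}")
```

A lower value means confidence tracks accuracy more closely; a model whose confidence matches its accuracy in every bin scores zero.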

Abstract

Most calibration methods operate at the logit level, implicitly assuming that miscalibration can be corrected without changing the underlying representation. We challenge this assumption and propose Calibration Attention (CalAttn), a representation-aware calibration module for vision transformers that couples instance-wise temperature scaling to transformer token geometry under a proper scoring objective. CalAttn predicts a sample-specific temperature from the [CLS] token and backpropagates calibration gradients into the backbone, thereby reshaping the uncertainty structure of the representation rather than post-hoc adjusting confidence. This yields token-conditioned uncertainty modulation with negligible overhead (<0.1% additional parameters). Across multiple datasets with ViT/DeiT/Swin backbones, CalAttn consistently improves calibration while…
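
To make the idea of a sample-specific temperature concrete, here is a minimal forward-pass sketch in plain Python. It is not the authors' implementation: the single linear head, the softplus used to keep the temperature positive, the `eps` floor, and all names and numbers are assumptions chosen for illustration. The abstract only states that the temperature is predicted from the [CLS] token and trained under a proper scoring objective; negative log-likelihood is used here as one example of such an objective. In the method described above the gradient of this loss would also flow into the backbone, which a stdlib-only sketch cannot show.

```python
import math


def softplus(x):
    # Numerically stable log(1 + exp(x)); always positive.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def softmax(logits):
    # Subtract the maximum before exponentiating to avoid overflow.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def instance_temperature(cls_token, weights, bias, eps=1e-3):
    # Hypothetical head: one linear map of the [CLS] embedding, then softplus.
    score = sum(w * x for w, x in zip(weights, cls_token)) + bias
    return softplus(score) + eps


def calibrated_probs(logits, cls_token, weights, bias):
    # Divide this sample's logits by its own temperature before the softmax.
    t = instance_temperature(cls_token, weights, bias)
    return softmax([z / t for z in logits]), t


def nll(probs, label):
    # Negative log-likelihood, a proper scoring rule.
    return -math.log(probs[label])


# Toy inputs (made up): a 2-dimensional [CLS] embedding and 3-class logits.
cls_token = [0.3, -0.2]
logits = [2.0, 0.5, -1.0]
probs, t = calibrated_probs(logits, cls_token, weights=[1.0, -1.0], bias=0.5)
print("temperature:", round(t, 3))
print("probabilities:", [round(p, 3) for p in probs])
print("NLL for label 0:", round(nll(probs, 0), 3))
```

Because the temperature divides every logit of a sample by the same positive number, it changes how confident the prediction is without changing which class ranks first. A temperature above 1 flattens the distribution and one below 1 sharpens it.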

Taxonomy

Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Advanced Memory and Neural Computing