Probing Routing-Conditional Calibration in Attention-Residual Transformers
Wenhao Liang, Lin Yue, Wei Emma Zhang, Miao Xu, Mingyu Guo, Olaf Maennel, Weitong Chen

TL;DR
This paper investigates whether routing-specific internal traces in Attention-Residual transformers provide stable evidence for calibration, finding that they do not offer reliable routing-conditional calibration signals beyond confidence measures.
Contribution
The paper introduces a matched-confidence diagnostic suite with within-bin routing-permutation controls to evaluate routing-conditional calibration, and finds that routing traces do not reliably indicate calibration issues.
Findings
Scalar routing summaries show no stable evidence of routing-conditional miscalibration; weighted gaps remain small or seed-sensitive.
Routing-aware calibration gains are not significant after controls.
Matched-confidence and permutation controls eliminate apparent routing-based calibration signals.
Abstract
Post-hoc calibration is usually evaluated as a function of logits or softmax confidence alone, even as routing-augmented architectures increasingly accompany predictions with sample-specific internal routing traces and pair them with claims of calibration-relevant uncertainty. We ask a basic question: do these traces provide stable routing-specific evidence for post-hoc calibration beyond confidence? We study this in Attention-Residual transformers (Kimi Team, 2026) through a matched-confidence diagnostic suite that stratifies examples by routing-derived state, compares subgroup gaps against within-bin routing-permutation nulls, and evaluates matched post-hoc probes differing only in their auxiliary feature. Across our completed AR runs, scalar routing summaries do not provide stable evidence of routing-conditional miscalibration: weighted gaps remain small or seed-sensitive, and only…
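The within-bin routing-permutation control described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the binary routing state (for example, a median split of a scalar routing summary), the equal-width confidence bins, and the choice of test statistic are all assumptions made for illustration. The idea is to compare the calibration gap between routing states inside each confidence bin, then shuffle routing labels only within bins so that the null preserves the confidence distribution.

```python
import numpy as np

def weighted_routing_gap(conf, correct, state, bin_ids, n_bins):
    """Bin-weighted |calibration-gap difference| between routing states 0 and 1.

    The calibration gap of a group is mean(correct) - mean(conf).
    Bins missing either routing state are skipped (an assumption).
    """
    n = len(conf)
    total = 0.0
    for b in range(n_bins):
        in_bin = bin_ids == b
        lo = in_bin & (state == 0)
        hi = in_bin & (state == 1)
        if lo.sum() == 0 or hi.sum() == 0:
            continue
        gap_lo = correct[lo].mean() - conf[lo].mean()
        gap_hi = correct[hi].mean() - conf[hi].mean()
        total += in_bin.sum() / n * abs(gap_hi - gap_lo)
    return total

def within_bin_permutation_test(conf, correct, state,
                                n_bins=10, n_perm=1000, seed=0):
    """Compare the observed gap to a null built by shuffling routing
    states inside each confidence bin (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    conf = np.asarray(conf, dtype=float)
    correct = np.asarray(correct, dtype=float)
    state = np.asarray(state, dtype=int)

    # Equal-width confidence bins on [0, 1]; conf == 1.0 goes in the last bin.
    bin_ids = np.minimum((conf * n_bins).astype(int), n_bins - 1)
    observed = weighted_routing_gap(conf, correct, state, bin_ids, n_bins)

    null = np.empty(n_perm)
    for i in range(n_perm):
        shuffled = state.copy()
        for b in range(n_bins):
            idx = np.flatnonzero(bin_ids == b)
            # Permute routing labels among examples of matched confidence.
            shuffled[idx] = rng.permutation(state[idx])
        null[i] = weighted_routing_gap(conf, correct, shuffled, bin_ids, n_bins)

    # Add-one correction keeps the p-value strictly positive.
    p_value = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, null, p_value
```

Under this sketch, a routing signal counts as evidence beyond confidence only if the observed gap sits in the upper tail of the null; repeating the test across seeds would be one way to check the stability the abstract emphasizes.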
