UAT-LITE: Inference-Time Uncertainty-Aware Attention for Pretrained Transformers

Elias Hossain; Shubhashis Roy Dipta; Subash Neupane; Rajib Rana; Ravid Shwartz-Ziv; Ivan Garibay; Niloofar Yousefi

arXiv:2602.02952·cs.AI·March 11, 2026

Open Access

TL;DR

UAT-LITE is an inference-time method that makes transformer self-attention uncertainty-aware. It injects epistemic uncertainty, estimated with Monte Carlo dropout, into attention, improving calibration and selective prediction without retraining.

Contribution

The paper proposes a novel inference-time framework that injects uncertainty directly into self-attention, improving model calibration and interpretability without altering pretrained weights.

Findings

01. Reduces expected calibration error by ~20% on benchmarks.

02. Maintains accuracy while improving uncertainty estimation.

03. Enhances selective prediction under distribution shifts.
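
For readers unfamiliar with the two metrics named in these findings, the following is a minimal sketch of how expected calibration error and selective-prediction accuracy are commonly computed. The function names, the equal-width binning with 10 bins, and the top-confidence coverage rule are assumptions chosen for illustration; the listing does not state the paper's exact evaluation protocol.

```python
import math


def expected_calibration_error(confidences, correct, n_bins=10):
    """Equal-width binned ECE: sum over bins of weight * |accuracy - mean confidence|.

    `confidences` are predicted probabilities of the chosen class in [0, 1];
    `correct` are booleans marking whether each prediction was right.
    """
    if len(confidences) != len(correct) or not confidences:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(confidences)
    conf_sum = [0.0] * n_bins
    hit_sum = [0.0] * n_bins
    for c, ok in zip(confidences, correct):
        idx = min(int(c * n_bins), n_bins - 1)  # confidence 1.0 goes in the last bin
        conf_sum[idx] += c
        hit_sum[idx] += 1.0 if ok else 0.0
    # (count / n) * |hits / count - conf / count| simplifies to |hits - conf| / n
    return sum(abs(h - c) for h, c in zip(hit_sum, conf_sum)) / n


def selective_accuracy(confidences, correct, coverage):
    """Accuracy on the most confident `coverage` fraction of predictions."""
    if not 0.0 < coverage <= 1.0:
        raise ValueError("coverage must be in (0, 1]")
    n_keep = max(1, math.ceil(coverage * len(confidences)))
    order = sorted(range(len(confidences)), key=lambda i: confidences[i], reverse=True)
    kept = order[:n_keep]
    return sum(1 for i in kept if correct[i]) / n_keep
```

A lower ECE means stated confidence tracks observed accuracy more closely. A model that is good at selective prediction shows accuracy rising as coverage shrinks, because the examples it abstains on are the ones it would have gotten wrong.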

Abstract

Neural NLP models are often miscalibrated and overconfident, assigning high confidence to incorrect predictions and failing to express uncertainty during internal evidence aggregation. This undermines selective prediction and high-stakes deployment. Post-hoc calibration methods adjust output probabilities but leave internal computation unchanged, while ensemble and Bayesian approaches improve uncertainty at substantial training or storage cost. We propose UAT-LITE, an inference-time framework that makes self-attention uncertainty-aware via Monte Carlo dropout in pretrained transformer classifiers. Unlike output-level calibration (e.g., temperature scaling, TS), UAT-LITE injects epistemic uncertainty directly into attention, enabling uncertainty-aware routing during contextualization and token-level diagnostic signals beyond global logit rescaling. Token-level epistemic uncertainty is estimated from…
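
To make the general idea in the abstract concrete, here is a toy sketch of Monte Carlo dropout feeding an uncertainty signal into attention. The abstract is truncated before it says how token-level uncertainty is estimated or how it enters attention, so everything specific below is an illustrative assumption rather than the paper's formulation: the variance-across-passes estimator, the subtractive logit penalty, the hypothetical `lam` strength parameter, and applying dropout directly to token vectors as a stand-in for a full stochastic forward pass through a pretrained model.

```python
import math
import random


def dropout(vec, p, rng):
    """Inverted dropout: zero each entry with probability p, rescale survivors."""
    keep = 1.0 - p
    return [x / keep if rng.random() < keep else 0.0 for x in vec]


def token_uncertainty(embeddings, n_passes=20, p=0.1, seed=0):
    """Assumed estimator: per-token variance across stochastic passes,
    averaged over feature dimensions."""
    rng = random.Random(seed)
    scores = []
    for vec in embeddings:
        samples = [dropout(vec, p, rng) for _ in range(n_passes)]
        dim_vars = []
        for d in range(len(vec)):
            vals = [s[d] for s in samples]
            mean = sum(vals) / n_passes
            dim_vars.append(sum((v - mean) ** 2 for v in vals) / n_passes)
        scores.append(sum(dim_vars) / len(dim_vars))
    return scores


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def uncertainty_aware_attention(logits_row, uncertainty, lam=1.0):
    """Hypothetical rule: subtract lam * uncertainty from each key's attention
    logit, then renormalize, so uncertain tokens receive less weight."""
    return softmax([l - lam * u for l, u in zip(logits_row, uncertainty)])
```

With `lam=0.0` the function reduces to ordinary softmax attention, which is one way to see why an approach of this kind needs no change to pretrained weights: the adjustment is applied only at inference, on top of logits the model already produces.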

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Advanced Neural Network Applications