ALIEN: Aligned Entropy Head for Improving Uncertainty Estimation of LLMs
Artem Zabolotnyi, Roman Makarov, Mile Mitrovic, Polina Proskura, Oleg Travkin, Roman Alferov, Alexey Zaytsev

TL;DR
ALIEN is a lightweight method that refines entropy-based uncertainty estimates in large language models by aligning them with prediction reliability, improving detection of incorrect predictions with minimal overhead.
Contribution
The paper introduces ALIEN, a lightweight uncertainty head that improves entropy-based uncertainty estimation through supervised alignment and applies across a range of models and tasks.
Findings
ALIEN outperforms strong baselines in detecting incorrect predictions.
It achieves the lowest calibration error across multiple datasets and models.
The method adds minimal inference overhead and few additional parameters.
Abstract
Uncertainty estimation remains a key challenge when adapting pre-trained language models to downstream classification tasks, with overconfidence often observed for difficult inputs. While predictive entropy provides a strong baseline for uncertainty estimation, it mainly captures aleatoric uncertainty and has limited capacity to capture effects such as class overlap or ambiguous linguistic cues. We introduce Aligned Entropy (ALIEN), a lightweight method that refines entropy-based uncertainty by aligning it with prediction reliability. ALIEN trains a small uncertainty head that is initialized to produce the model's original entropy and is subsequently fine-tuned with two regularization mechanisms. Experiments across seven classification datasets and two NER benchmarks, evaluated on five language models (RoBERTa, ELECTRA, LLaMA-2, Qwen2.5, and Qwen3), show that ALIEN consistently outperforms…
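For reference, the predictive-entropy baseline that the abstract mentions can be sketched in a few lines of Python. This is only the standard entropy of a softmax distribution, not the ALIEN head, whose architecture and regularizers are not described in the text above. The function names `softmax` and `predictive_entropy` and the example logits are illustrative assumptions.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predictive_entropy(probs):
    """Shannon entropy H(p) = -sum_k p_k log p_k, in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


# Hypothetical logits: one peaked prediction, one fully ambiguous prediction.
confident = predictive_entropy(softmax([8.0, 0.0, 0.0]))
ambiguous = predictive_entropy(softmax([1.0, 1.0, 1.0]))

print(f"confident: {confident:.4f} nats")
print(f"ambiguous: {ambiguous:.4f} nats")
```

A uniform distribution over three classes attains the maximum entropy, log 3, while a sharply peaked distribution gives a value close to zero. A model that is overconfident on a difficult input will also yield low entropy there, which is the kind of mismatch with prediction reliability that the abstract says ALIEN is meant to address.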
