Uncertainty-Aware Transformers: Conformal Prediction for Language Models
Abhiram Vellore, Niraj K. Jha

TL;DR
This paper introduces CONFIDE, a conformal prediction framework for transformer-based language models that provides statistically valid uncertainty estimates and interpretability, improving trustworthiness in critical applications.
Contribution
It presents a novel conformal prediction method for transformer embeddings, enabling reliable uncertainty quantification and interpretability in language models.
Findings
CONFIDE improves test accuracy by up to 4.09% on BERT-tiny.
It achieves higher correct efficiency than prior methods.
Early and intermediate transformer layers yield better-calibrated representations.
Abstract
Transformers have had a profound impact on the field of artificial intelligence, especially on large language models and their variants. However, as was the case with neural networks, their black-box nature limits trust and deployment in high-stakes settings. For models to be genuinely useful and trustworthy in critical applications, they must provide more than just predictions: they must supply users with a clear understanding of the reasoning that underpins their decisions. This article presents an uncertainty quantification framework for transformer-based language models. This framework, called CONFIDE (CONformal prediction for FIne-tuned DEep language models), applies conformal prediction to the internal embeddings of encoder-only architectures, like BERT and RoBERTa, while enabling hyperparameter tuning. CONFIDE uses either [CLS] token embeddings or flattened hidden states to…
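To make the idea of conformal prediction over embeddings concrete, the following is a minimal sketch of generic split (inductive) conformal prediction with a nearest-neighbor nonconformity score. It is not the CONFIDE implementation: the score function, the helper names (`knn_score`, `prediction_set`, `make_split`), the choice of `k`, and the synthetic Gaussian vectors standing in for [CLS] embeddings are all assumptions made for illustration.

```python
import numpy as np


def knn_score(x, label, train_X, train_y, k=3):
    """Nonconformity score (illustrative choice): mean Euclidean distance
    from x to its k nearest training embeddings that carry `label`."""
    same_class = train_X[train_y == label]
    dists = np.sort(np.linalg.norm(same_class - x, axis=1))
    return dists[:k].mean()


def prediction_set(x, train_X, train_y, cal_scores, classes, epsilon=0.1, k=3):
    """Return every label whose conformal p-value exceeds epsilon."""
    n_cal = len(cal_scores)
    kept = []
    for c in classes:
        s = knn_score(x, c, train_X, train_y, k)
        # p-value: share of calibration scores at least as large as s,
        # with the +1 correction that counts the test point itself.
        p_value = (np.sum(cal_scores >= s) + 1) / (n_cal + 1)
        if p_value > epsilon:
            kept.append(int(c))
    return kept


rng = np.random.default_rng(0)


def make_split(n, dim=8):
    """Synthetic stand-in for embeddings: two shifted Gaussian clusters."""
    y = rng.integers(0, 2, size=n)
    X = rng.normal(size=(n, dim)) + 4.0 * y[:, None]
    return X, y


train_X, train_y = make_split(200)   # reference embeddings for the k-NN score
cal_X, cal_y = make_split(100)       # held-out calibration split
test_X, test_y = make_split(200)

# Calibration scores are computed with the true labels.
cal_scores = np.array(
    [knn_score(x, y, train_X, train_y) for x, y in zip(cal_X, cal_y)]
)

sets = [prediction_set(x, train_X, train_y, cal_scores, [0, 1]) for x in test_X]
coverage = float(np.mean([y in s for y, s in zip(test_y, sets)]))
singleton_rate = float(np.mean([len(s) == 1 for s in sets]))
print(f"empirical coverage: {coverage:.3f}, singleton sets: {singleton_rate:.3f}")
```

With `epsilon=0.1`, the prediction sets are constructed to contain the true label for roughly 90% of exchangeable test points on average. In a real pipeline the synthetic arrays would be replaced by embeddings extracted from a chosen layer of a fine-tuned encoder.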
