UAlign: Leveraging Uncertainty Estimations for Factuality Alignment on Large Language Models

Boyang Xue; Fei Mi; Qi Zhu; Hongru Wang; Rui Wang; Sheng Wang; Erxin Yu; Xuming Hu; and Kam-Fai Wong

arXiv:2412.11803 · cs.CL · May 26, 2025

TL;DR

UAlign is a framework that uses uncertainty estimations to represent an LLM's knowledge boundaries, so that the model expresses the factual knowledge it has more accurately and refuses questions it cannot answer.

Contribution

The paper computes two uncertainty estimations (confidence score and semantic entropy) and adds them to prompts as input features for factuality alignment. It trains a reward model that incorporates these estimations, then aligns the LLM with PPO.
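To make the idea of uncertainty estimations as prompt features concrete, here is a minimal sketch. The function name `build_uncertainty_prompt` and the prompt template are hypothetical illustrations; this page does not show the prompt format the authors use.

```python
def build_uncertainty_prompt(question: str, confidence: float, entropy: float) -> str:
    """Prepend two uncertainty estimates to a QA prompt.

    The template below is an assumption for illustration only,
    not the format used in the paper.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must lie in [0, 1]")
    if entropy < 0.0:
        raise ValueError("entropy must be non-negative")
    return (
        f"Confidence: {confidence:.2f}\n"
        f"Semantic entropy: {entropy:.2f}\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Example values are made up for the demonstration.
prompt = build_uncertainty_prompt("What is the capital of France?", 0.75, 0.56)
```

The same string could be passed to a policy model or a reward model, which is one plausible way to let both see the estimated knowledge boundary.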

Findings

01 Significantly improves LLMs' factual answering accuracy

02 Enhances model confidence and refusal of unknown questions

03 Demonstrates robustness across in-domain and out-of-domain tasks

Abstract

Despite demonstrating impressive capabilities, Large Language Models (LLMs) still often struggle to accurately express the factual knowledge they possess, especially in cases where the LLMs' knowledge boundaries are ambiguous. To improve LLMs' factual expressions, we propose the UAlign framework, which leverages Uncertainty estimations to represent knowledge boundaries, and then explicitly incorporates these representations as input features into prompts for LLMs to Align with factual knowledge. First, we prepare the dataset on knowledge question-answering (QA) samples by calculating two uncertainty estimations, including confidence score and semantic entropy, to represent the knowledge boundaries for LLMs. Subsequently, using the prepared dataset, we train a reward model that incorporates uncertainty estimations and then employ the Proximal Policy Optimization (PPO) algorithm for…
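The two uncertainty estimations named in the abstract can be sketched from a set of answers sampled for one question. This is a simplified illustration under stated assumptions, not the authors' implementation: confidence is taken as the fraction of sampled answers that match the reference, and semantic entropy as the Shannon entropy over answer clusters, with normalized exact match standing in for a real semantic-equivalence check. All function names are hypothetical.

```python
import math
from collections import Counter


def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace; a crude stand-in for semantic equivalence."""
    return " ".join(answer.lower().split())


def confidence_score(sampled_answers: list[str], reference: str) -> float:
    """Fraction of sampled answers that match the reference answer (assumed definition)."""
    if not sampled_answers:
        raise ValueError("need at least one sampled answer")
    ref = normalize(reference)
    hits = sum(1 for a in sampled_answers if normalize(a) == ref)
    return hits / len(sampled_answers)


def semantic_entropy(sampled_answers: list[str]) -> float:
    """Shannon entropy (in nats) over clusters of equivalent answers."""
    if not sampled_answers:
        raise ValueError("need at least one sampled answer")
    clusters = Counter(normalize(a) for a in sampled_answers)
    total = sum(clusters.values())
    # p * log(1/p) keeps the result at 0.0 (not -0.0) for a single cluster.
    return sum((c / total) * math.log(total / c) for c in clusters.values())


# Made-up samples for one question.
samples = ["Paris", "paris ", "Lyon", "Paris"]
conf = confidence_score(samples, "Paris")
ent = semantic_entropy(samples)
```

High confidence with low entropy suggests a question inside the model's knowledge boundary; low confidence with high entropy suggests one outside it.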

Code & Models

Repositories

amourwaltz/ualign (PyTorch, official)

Videos

UAlign: Leveraging Uncertainty Estimations for Factuality Alignment on Large Language Models · underline

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Computational and Text Analysis Methods

Methods: ALIGN