FAITH: Factuality Alignment through Integrating Trustworthiness and Honestness

Xiaoning Dong; Chengyan Wu; Yajie Wen; Yu Chen; Yun Xue; Jing Zhang; Wei Xu; Bolei Ma

arXiv:2604.10189·cs.CL·April 14, 2026

TL;DR

FAITH is a post-training framework that improves LLM factuality by integrating trustworthiness and honestness signals, external knowledge retrieval, and a reward-based fine-tuning process.

Contribution

FAITH introduces a natural-language-based approach that aligns an LLM's internal states of trustworthiness and honestness with external knowledge, improving factual accuracy.

Findings

01

FAITH improves factual accuracy on four benchmarks.

02

The retrieval module increases consistency between internal and external knowledge.

03

Reward-based fine-tuning enhances truthfulness of LLM outputs.

Abstract

Large Language Models (LLMs) can generate factually inaccurate content even when they possess the corresponding knowledge, which critically undermines their reliability. Existing approaches attempt to mitigate this by incorporating uncertainty scores into QA prompts during training, but these numerical scores lack the semantic richness an LLM needs to properly understand its internal states of trustworthiness and honestness, leading to insufficient factuality alignment. We introduce FAITH (Factuality Alignment through Integrating Trustworthiness and Honestness), a post-training framework for factuality alignment that integrates natural-language uncertainty signals with external knowledge. Specifically, we augment training datasets by computing confidence scores and semantic entropy from LLM outputs and mapping them into a knowledge state quadrant that describes the model's internal knowledge possession…
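The quadrant mapping mentioned in the abstract can be illustrated with a rough sketch. This is not the authors' implementation: the function names, the 0.5 thresholds, and the quadrant descriptions are all hypothetical, confidence is assumed here to be the fraction of sampled answers that match a reference answer, and semantic clustering is approximated by exact match on normalized strings (a real system would likely group answers by meaning, for example with an entailment model).

```python
import math
from collections import Counter


def normalize(answer: str) -> str:
    """Crude stand-in for semantic clustering: lowercase, trim, drop a trailing period."""
    return " ".join(answer.lower().strip().rstrip(".").split())


def confidence(samples: list[str], reference: str) -> float:
    """Assumed definition: fraction of sampled answers matching the reference."""
    ref = normalize(reference)
    return sum(normalize(s) == ref for s in samples) / len(samples)


def semantic_entropy(samples: list[str]) -> float:
    """Shannon entropy (in nats) over clusters of equivalent answers."""
    counts = Counter(normalize(s) for s in samples)
    n = len(samples)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def knowledge_quadrant(conf: float, entropy: float,
                       conf_threshold: float = 0.5,
                       entropy_threshold: float = 0.5) -> str:
    """Map the two scores to a natural-language state (hypothetical labels and thresholds)."""
    correct = conf >= conf_threshold
    consistent = entropy <= entropy_threshold
    if correct and consistent:
        return "The model knows the answer and answers consistently."
    if correct:
        return "The model knows the answer but answers inconsistently."
    if consistent:
        return "The model does not know the answer but answers consistently."
    return "The model does not know the answer and answers inconsistently."


# Four sampled answers to "What is the capital of France?"
samples = ["Paris", "paris.", "Paris", "Lyon"]
conf = confidence(samples, "Paris")
entropy = semantic_entropy(samples)
print(conf, round(entropy, 3), knowledge_quadrant(conf, entropy))
```

The natural-language description returned by `knowledge_quadrant` stands in for the kind of uncertainty signal the abstract contrasts with raw numerical scores; how FAITH actually phrases and uses these signals is not specified in the text above.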
