TruthFlow: Truthful LLM Generation via Representation Flow Correction
Hanyu Wang, Bochuan Cao, Yuanpu Cao, Jinghui Chen

TL;DR
TruthFlow introduces a flow-based, query-specific correction method that significantly enhances the truthfulness of large language model outputs, outperforming universal correction techniques across multiple benchmarks.
Contribution
TruthFlow applies flow matching to learn query-specific representation corrections that improve LLM truthfulness, addressing the limitation of methods that apply one universal correction vector to every query.
Findings
Significantly improves truthfulness on open-ended generation tasks.
Demonstrates strong transferability to unseen hallucination benchmarks.
Outperforms existing universal correction techniques.
Abstract
Large language models (LLMs) are known to struggle with consistently generating truthful responses. While various representation intervention techniques have been proposed, these methods typically apply a universal representation correction vector to all input queries, limiting their effectiveness against diverse queries in practice. In this study, we introduce TruthFlow, a novel method that leverages the Flow Matching technique for query-specific truthful representation correction. Specifically, TruthFlow first uses a flow model to learn query-specific correction vectors that transition representations from hallucinated to truthful states. Then, during inference, the trained flow model generates these correction vectors to enhance the truthfulness of LLM outputs. Experimental results demonstrate that TruthFlow significantly improves performance on open-ended generation tasks across…
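The two stages described in the abstract (train a flow model, then generate correction vectors at inference) can be illustrated with a minimal sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the data are synthetic stand-ins for LLM hidden states, the flow is assumed to transport a query representation to a correction vector along a straight path, a least-squares linear model stands in for a neural velocity network, and the names `make_pairs`, `fit_linear_velocity`, `generate_correction`, and the strength `alpha` are hypothetical.

```python
import numpy as np

def make_pairs(rng, n=256, dim=4):
    """Synthetic stand-ins for hidden states (hypothetical toy data)."""
    query = rng.normal(size=(n, dim))        # x0: query representations
    mix = rng.normal(size=(dim, dim)) * 0.5  # arbitrary toy mapping
    correction = query @ mix + 0.1           # x1: toy "truthful minus hallucinated"
    return query, correction

def flow_matching_batch(rng, x0, x1):
    """Sample t ~ U(0, 1), interpolate on the straight path, return the target."""
    t = rng.uniform(size=(x0.shape[0], 1))
    x_t = (1.0 - t) * x0 + t * x1
    target = x1 - x0                         # velocity of the straight path
    return x_t, t, target

def features(x, t):
    """Inputs to the linear velocity model: state, time, and a bias column."""
    return np.hstack([x, t, np.ones_like(t)])

def fit_linear_velocity(x_t, t, target):
    """Least-squares fit, standing in for gradient training of a network."""
    weights, *_ = np.linalg.lstsq(features(x_t, t), target, rcond=None)
    return weights

def euler_integrate(velocity_fn, x_start, steps=20):
    """Integrate dx/dt = v(x, t) from t = 0 to t = 1 with Euler steps."""
    x = x_start.copy()
    dt = 1.0 / steps
    for k in range(steps):
        t = np.full((x.shape[0], 1), k * dt)
        x = x + dt * velocity_fn(x, t)
    return x

def generate_correction(weights, query, steps=20):
    """Run the fitted flow on new query representations."""
    return euler_integrate(lambda x, t: features(x, t) @ weights, query, steps)

rng = np.random.default_rng(0)
x0, x1 = make_pairs(rng)
x_t, t, target = flow_matching_batch(rng, x0, x1)
weights = fit_linear_velocity(x_t, t, target)

# Regression loss of the fitted model versus an all-zero velocity baseline.
fit_loss = float(np.mean((features(x_t, t) @ weights - target) ** 2))
zero_loss = float(np.mean(target ** 2))

# Inference: add the generated, query-specific correction to toy hidden states.
hidden = rng.normal(size=(3, x0.shape[1]))
alpha = 1.0                                  # hypothetical correction strength
corrected = hidden + alpha * generate_correction(weights, hidden)
```

Each row of `hidden` receives its own correction because the integration starts from that row's representation, in contrast to adding one fixed vector to all queries. A linear velocity model can only approximate the true field here, so the sketch shows the mechanics rather than the accuracy a trained network would reach.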
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Artificial Intelligence in Law
