Balancing Truthfulness and Informativeness with Uncertainty-Aware Instruction Fine-Tuning

Tianyi Wu; Jingwei Ni; Bryan Hooi; Jiaheng Zhang; Elliott Ash; See-Kiong Ng; Mrinmaya Sachan; Markus Leippold

arXiv:2502.11962 · cs.CL · June 26, 2025

TL;DR

This paper shows that instruction fine-tuning on knowledge unfamiliar to a large language model can make it more informative but less truthful, and proposes two methods, $UNIT_{cut}$ and $UNIT_{ref}$, to manage this trade-off.

Contribution

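The paper introduces two instruction fine-tuning (IFT) paradigms: $UNIT_{cut}$, which identifies and removes unfamiliar knowledge from IFT datasets, and $UNIT_{ref}$, which trains LLMs to recognize their uncertainty and state it explicitly at the end of a response.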

Findings

01. $UNIT_{cut}$ improves model truthfulness.

02. $UNIT_{ref}$ maintains informativeness and reduces hallucinations.

03. Unfamiliar knowledge in datasets can harm truthfulness.

Abstract

Instruction fine-tuning (IFT) can increase the informativeness of large language models (LLMs), but may reduce their truthfulness. This trade-off arises because IFT steers LLMs to generate responses containing long-tail knowledge that was not well covered during pre-training. As a result, models become more informative but less accurate when generalizing to unseen tasks. In this paper, we empirically demonstrate how unfamiliar knowledge in IFT datasets can negatively affect the truthfulness of LLMs, and we introduce two new IFT paradigms, $UNIT_{cut}$ and $UNIT_{ref}$, to address this issue. $UNIT_{cut}$ identifies and removes unfamiliar knowledge from IFT datasets to mitigate its impact on model truthfulness, whereas $UNIT_{ref}$ trains LLMs to recognize their uncertainty and explicitly indicate it at the end of their responses. Our experiments show that $UNIT_{cut}$ substantially…
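To make the difference between the two paradigms concrete, here is a minimal Python sketch of how an IFT response might be rewritten under each one. It is an illustration built on assumptions, not the authors' implementation: the `Claim` type, the per-claim `uncertainty` score, the `threshold` value, and the wording of the uncertainty note are all hypothetical, and the abstract does not say how unfamiliar knowledge is actually detected.

```python
from dataclasses import dataclass


@dataclass
class Claim:
    text: str
    # Hypothetical score in [0, 1]; higher means the knowledge is assumed
    # to be less familiar to the base model. How to compute it is not
    # specified in the abstract.
    uncertainty: float


def unit_cut(claims: list[Claim], threshold: float = 0.5) -> str:
    """Sketch of the 'cut' idea: drop claims flagged as unfamiliar."""
    kept = [c.text for c in claims if c.uncertainty <= threshold]
    return " ".join(kept)


def unit_ref(claims: list[Claim], threshold: float = 0.5) -> str:
    """Sketch of the 'ref' idea: keep everything, flag uncertainty at the end."""
    response = " ".join(c.text for c in claims)
    uncertain = [c.text for c in claims if c.uncertainty > threshold]
    if not uncertain:
        return response
    # The note wording is an assumption chosen for illustration.
    note = "I am unsure about the following: " + " ".join(uncertain)
    return response + "\n\n" + note


example = [
    Claim("Water boils at 100 C at sea level.", 0.1),
    Claim("Placeholder long-tail detail.", 0.9),
]
print(unit_cut(example))
print(unit_ref(example))
```

In this sketch the "cut" variant trades informativeness for truthfulness by shortening the training target, while the "ref" variant keeps the full target and moves the uncertainty signal into a trailing note.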

Taxonomy

Topics: Intelligent Tutoring Systems and Adaptive Learning