Small Updates, Big Doubts: Does Parameter-Efficient Fine-tuning Enhance Hallucination Detection?
Xu Hu, Yifan Zhang, Songtao Wei, Chen Zhao, Qiannan Li, Bingzhe Li, Feng Chen

TL;DR
This study systematically evaluates how parameter-efficient fine-tuning (PEFT) affects hallucination detection in large language models, revealing that PEFT generally enhances detection performance by reshaping uncertainty representation.
Contribution
It provides the first comprehensive empirical analysis of PEFT's impact on hallucination detection across multiple models and benchmarks, highlighting its role in uncertainty encoding.
Findings
PEFT improves hallucination detection AUROC across detectors.
PEFT reshapes uncertainty encoding without adding factual knowledge.
The effect is consistent across different models and datasets.
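Since the findings are reported as AUROC, a minimal sketch of that metric may help. It treats each detector output as an uncertainty score and measures how often a hallucinated answer scores higher than a correct one. The function name `auroc`, the label convention (1 = hallucinated), and the toy scores are assumptions for illustration only; they are not data or code from the paper.

```python
def auroc(scores, labels):
    """Pairwise AUROC: probability that a randomly chosen hallucinated
    sample (label 1) gets a higher score than a randomly chosen correct
    sample (label 0). Ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up toy values, NOT results from the paper.
labels = [1, 1, 0, 0, 1, 0]                   # 1 = hallucinated answer
scores_a = [0.6, 0.4, 0.5, 0.3, 0.7, 0.45]    # hypothetical detector scores
scores_b = [0.8, 0.6, 0.4, 0.2, 0.9, 0.5]     # hypothetical, better separated

print(round(auroc(scores_a, labels), 3))
print(round(auroc(scores_b, labels), 3))
```

An AUROC of 0.5 corresponds to a detector that ranks hallucinated and correct answers no better than chance, and 1.0 to perfect separation. The quadratic pairwise loop is chosen for readability; a rank-based implementation would be preferable on large evaluation sets.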
Abstract
Parameter-efficient fine-tuning (PEFT) methods are widely used to adapt large language models (LLMs) to downstream tasks and are often assumed to improve factual correctness. However, how these methods affect hallucination behavior remains insufficiently understood, especially on QA datasets. In this work, we systematically investigate the impact of PEFT on hallucination detection through a comprehensive empirical study across three open-weight LLM backbones and three fact-seeking QA benchmarks. For each model, we evaluate performance using seven unsupervised hallucination detection methods spanning three complementary approaches: semantic-consistency-based detectors, confidence-based detectors, and entropy-based detectors. This multifaceted evaluation enables us to characterize how PEFT reshapes uncertainty across different detection paradigms. In…
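The three detector families named in the abstract can be illustrated with one simple score each. This is a rough sketch under stated assumptions, not the seven detectors evaluated in the paper: the function names are hypothetical, the inputs are assumed to be already extracted from a model (token log-probabilities, per-token probability distributions, and several sampled answers), and exact string match stands in for a real semantic-equivalence check.

```python
import math
from collections import Counter


def confidence_score(token_logprobs):
    """Confidence-based: negative mean log-probability of the generated
    tokens. Higher values mean the model was less confident."""
    return -sum(token_logprobs) / len(token_logprobs)


def entropy_score(token_distributions):
    """Entropy-based: mean Shannon entropy (in nats) of the per-token
    probability distributions. Higher values mean more uncertainty."""
    def entropy(dist):
        return -sum(q * math.log(q) for q in dist if q > 0)
    return sum(entropy(d) for d in token_distributions) / len(token_distributions)


def consistency_score(sampled_answers):
    """Consistency-based: one minus the share of sampled answers that
    agree with the most common answer. Assumption: exact match after
    lowercasing and stripping approximates semantic equivalence."""
    normalized = [a.strip().lower() for a in sampled_answers]
    top_count = Counter(normalized).most_common(1)[0][1]
    return 1.0 - top_count / len(normalized)


# Toy inputs for illustration only.
print(confidence_score([math.log(0.5), math.log(0.5)]))
print(entropy_score([[0.5, 0.5], [1.0, 0.0]]))
print(consistency_score(["Paris", "paris ", "Lyon", "Paris"]))
```

All three functions return a score where larger means more uncertain, so each can be compared against hallucination labels with the same ranking metric.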
Taxonomy
Topics: Topic Modeling · Adversarial Robustness in Machine Learning · Misinformation and Its Impacts
