From Parameters to Prompts: Understanding and Mitigating the Factuality Gap between Fine-Tuned LLMs

Xuan Gong; Hanbo Huang; Shiyu Liang

arXiv:2505.23410 · cs.CL · May 30, 2025

Open Access

TL;DR

This paper investigates how the factuality gap in fine-tuned large language models can be mitigated at inference time through prompts and in-context learning, revealing the dominant role of test prompts in knowledge extraction.

Contribution

The paper demonstrates that test-time prompts and in-context learning can reduce the factuality gap that arises when fine-tuning on known versus unknown knowledge, and supports this with a theoretical analysis from the perspective of knowledge graphs.

Findings

01. Test-time prompts diminish the impact of fine-tuning data on factuality.

02. In-context learning effectively compensates for limited or poor fine-tuning data.

03. Theoretical analysis shows prompts can overshadow fine-tuning effects in knowledge extraction.

Abstract

Factual knowledge extraction aims to explicitly extract knowledge parameterized in pre-trained language models for application in downstream tasks. While prior work has investigated the impact of supervised fine-tuning data on the factuality of large language models (LLMs), the underlying mechanism remains poorly understood. We revisit this impact through systematic experiments, with a particular focus on the factuality gap that arises when fine-tuning on known versus unknown knowledge. Our findings show that this gap can be mitigated at the inference stage, either under out-of-distribution (OOD) settings or by using appropriate in-context learning (ICL) prompts (i.e., few-shot learning and Chain of Thought (CoT)). We prove this phenomenon theoretically from the perspective of knowledge graphs, showing that the test-time prompt may diminish or even overshadow the impact of fine-tuning data…
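To make the two ICL prompt styles named in the abstract concrete, here is a minimal sketch of how a few-shot prompt, with an optional Chain-of-Thought cue, might be assembled at test time. The function name `build_few_shot_prompt`, the `Q:`/`A:` layout, the demonstration pairs, and the "Let's think step by step." cue are all illustrative assumptions; they are not the prompts or templates used in the paper.

```python
from typing import List, Tuple


def build_few_shot_prompt(
    demos: List[Tuple[str, str]], question: str, cot: bool = False
) -> str:
    """Prepend solved question/answer demonstrations to a test question.

    Hypothetical helper for illustration only, not the paper's template.
    """
    # Each demonstration becomes one "Q: ... / A: ..." block.
    blocks = [f"Q: {q}\nA: {a}" for q, a in demos]
    # With cot=True the answer slot starts with a reasoning cue (an assumed
    # phrasing); otherwise it is left open for a direct answer.
    answer_slot = "A: Let's think step by step." if cot else "A:"
    blocks.append(f"Q: {question}\n{answer_slot}")
    return "\n\n".join(blocks)


# Made-up demonstrations, chosen only to show the layout.
demos = [
    ("What is the capital of France?", "Paris"),
    ("Who wrote Hamlet?", "William Shakespeare"),
]
prompt = build_few_shot_prompt(demos, "What is the capital of Japan?")
cot_prompt = build_few_shot_prompt(demos, "What is the capital of Japan?", cot=True)
```

The resulting string would be passed unchanged to whichever fine-tuned model is being evaluated; only the test-time prompt changes, while the model parameters stay fixed.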

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Artificial Intelligence in Healthcare and Education

Methods: Focus