From Parameters to Prompts: Understanding and Mitigating the Factuality Gap between Fine-Tuned LLMs
Xuan Gong, Hanbo Huang, Shiyu Liang

TL;DR
This paper investigates how the factuality gap in fine-tuned large language models can be mitigated at inference time through prompts and in-context learning, revealing the dominant role of test prompts in knowledge extraction.
Contribution
It demonstrates that test-time prompts and in-context learning can reduce the factuality gap caused by fine-tuning, supported by theoretical insights from knowledge graphs.
Findings
Test-time prompts diminish the impact of fine-tuning data on factuality.
In-context learning effectively compensates for limited or poor fine-tuning data.
Theoretical analysis shows prompts can overshadow fine-tuning effects in knowledge extraction.
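To make the two in-context learning styles named in the abstract concrete, here is a minimal sketch of how few-shot and Chain-of-Thought test prompts might be assembled for a factual question. The function names, the "Question:/Answer:" template, and the demonstration facts are illustrative assumptions, not the prompt format used in the paper.

```python
from typing import List, Tuple


def build_few_shot_prompt(demos: List[Tuple[str, str]], question: str) -> str:
    """Prepend solved question/answer demonstrations to the test question.

    Hypothetical template: the paper's actual prompt format is not given here.
    """
    blocks = [f"Question: {q}\nAnswer: {a}" for q, a in demos]
    # The test question is left unanswered so the model completes it.
    blocks.append(f"Question: {question}\nAnswer:")
    return "\n\n".join(blocks)


def build_cot_prompt(question: str) -> str:
    """Append a generic step-by-step cue (an assumed CoT trigger phrase)."""
    return f"Question: {question}\nAnswer: Let's think step by step."


# Illustrative demonstrations only; these are not data from the paper.
demos = [
    ("What is the capital of France?", "Paris"),
    ("What is the capital of Japan?", "Tokyo"),
]
few_shot_prompt = build_few_shot_prompt(demos, "What is the capital of Italy?")
cot_prompt = build_cot_prompt("What is the capital of Italy?")
```

Either string would be passed to the fine-tuned model at inference time in place of the bare question; the fine-tuned weights themselves are unchanged, which is what makes this a test-time intervention.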
Abstract
Factual knowledge extraction aims to explicitly extract knowledge parameterized in pre-trained language models for application in downstream tasks. While prior work has investigated the impact of supervised fine-tuning data on the factuality of large language models (LLMs), the underlying mechanism remains poorly understood. We revisit this impact through systematic experiments, with a particular focus on the factuality gap that arises when fine-tuning on known versus unknown knowledge. Our findings show that this gap can be mitigated at the inference stage, either under out-of-distribution (OOD) settings or by using appropriate in-context learning (ICL) prompts (i.e., few-shot learning and Chain of Thought (CoT)). We prove this phenomenon theoretically from the perspective of knowledge graphs, showing that the test-time prompt may diminish or even overshadow the impact of fine-tuning data…
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Artificial Intelligence in Healthcare and Education
Methods: Focus
