Fictitious Synthetic Data Can Improve LLM Factuality via Prerequisite Learning
Yujian Liu, Shiyu Chang, Tommi Jaakkola, Yang Zhang

TL;DR
This paper introduces Prereq-Tune, a fine-tuning strategy that disentangles the learning of skills and knowledge in LLMs. It adds a prerequisite learning stage before SFT and can be combined with fictitious synthetic data, with the aim of reducing hallucinations and improving factuality.
Contribution
The paper proposes Prereq-Tune, a new fine-tuning method that separates skill and knowledge learning, enhancing LLM factuality and enabling knowledge-controlled generation.
Findings
Prereq-Tune outperforms existing methods on factuality tasks.
Synthetic data combined with Prereq-Tune improves the grounding of LLM outputs in the model's internal knowledge.
Prereq-Tune reduces hallucinations in both short-form and long-form generation.
Abstract
Recent studies have identified one aggravating factor of LLM hallucinations as the knowledge inconsistency between pre-training and fine-tuning, where unfamiliar fine-tuning data mislead the LLM to fabricate plausible but wrong outputs. In this paper, we propose a novel fine-tuning strategy called Prereq-Tune to address this knowledge inconsistency and reduce hallucinations. Fundamentally, Prereq-Tune disentangles the learning of skills and knowledge, so the model learns only the task skills without being impacted by the knowledge inconsistency. To achieve this, Prereq-Tune introduces an additional prerequisite learning stage to learn the necessary knowledge for SFT, allowing subsequent SFT to focus only on task skills. Prereq-Tune can also be combined with fictitious synthetic data to enhance the grounding of LLM outputs to their internal knowledge. Experiments show that Prereq-Tune…
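The two-stage schedule described in the abstract can be pictured with a small bookkeeping sketch. This is not the authors' implementation and performs no real training: ToyModel, prerequisite_stage, sft_stage, and the example records are hypothetical names invented for illustration. It only counts how many fine-tuning examples rely on facts the model has not seen, with and without a prerequisite stage placed before SFT.

```python
from dataclasses import dataclass, field


@dataclass
class ToyModel:
    """Hypothetical stand-in for an LLM; it only tracks what it has been shown."""
    knowledge: set = field(default_factory=set)  # facts absorbed so far
    skill_steps: int = 0                         # number of SFT examples seen


def required_facts(examples):
    """Collect every fact that the SFT examples rely on."""
    facts = set()
    for ex in examples:
        facts |= ex["facts"]
    return facts


def prerequisite_stage(model, facts):
    """Stage 1 (assumed form): learn the knowledge needed for SFT."""
    model.knowledge.update(facts)


def sft_stage(model, examples):
    """Stage 2: train on task examples and count the unfamiliar ones."""
    unfamiliar = 0
    for ex in examples:
        # An example is unfamiliar if any of its facts is unknown to the model.
        if not ex["facts"] <= model.knowledge:
            unfamiliar += 1
        model.skill_steps += 1
    return unfamiliar


def run(examples, use_prerequisite):
    """Run the schedule and return how many SFT examples were unfamiliar."""
    model = ToyModel()
    if use_prerequisite:
        prerequisite_stage(model, required_facts(examples))
    return sft_stage(model, examples)


# Fictitious records, made up purely for this sketch.
SFT_EXAMPLES = [
    {"prompt": "Write a short bio of Zara Quill.",
     "facts": {"Zara Quill was born in 1987"}},
    {"prompt": "Where does Zara Quill work?",
     "facts": {"Zara Quill works at Norvale Labs"}},
]

if __name__ == "__main__":
    print("unfamiliar without prerequisite stage:", run(SFT_EXAMPLES, False))
    print("unfamiliar with prerequisite stage:", run(SFT_EXAMPLES, True))
```

In this toy setting, every SFT example is unfamiliar when the prerequisite stage is skipped, and none is unfamiliar once it runs first. That mirrors the abstract's claim that SFT can then focus on task skills, but how the paper actually stores knowledge and skills is not specified in the text above.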
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Machine Learning and Data Classification
Methods: Shrink and Fine-Tune · Focus
