Instruction-tuned Language Models are Better Knowledge Learners
Zhengbao Jiang, Zhiqing Sun, Weijia Shi, Pedro Rodriguez, Chunting Zhou, Graham Neubig, Xi Victoria Lin, Wen-tau Yih, Srinivasan Iyer

TL;DR
This paper introduces pre-instruction-tuning (PIT), a novel method where language models are instruction-tuned on questions before training on documents, significantly improving their ability to learn and answer questions from complex new data.
Contribution
The paper proposes pre-instruction-tuning (PIT), a new training paradigm that enhances knowledge absorption in LLMs by reversing the standard instruction-tuning sequence.
Findings
PIT outperforms standard instruction-tuning by 17.8% in knowledge absorption.
Instruction-tuning on questions before documents improves question-answering ability.
LLMs trained with PIT better absorb knowledge from complex documents that weave many factual statements together.
Abstract
In order for large language model (LLM)-based assistants to effectively adapt to evolving information needs, it must be possible to update their factual knowledge through continued training on new data. The standard recipe for doing so involves continued pre-training on new documents followed by instruction-tuning on question-answer (QA) pairs. However, we find that LLMs trained with this recipe struggle to answer questions, even though the perplexity of documents is minimized. We found that QA pairs are generally straightforward, while documents are more complex, weaving many factual statements together in an intricate manner. Therefore, we hypothesize that it is beneficial to expose LLMs to QA pairs before continued pre-training on documents so that the process of encoding knowledge from complex documents takes into account how this knowledge is accessed through questions. Based on…
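The difference between the two recipes described in the abstract comes down to the order of the training stages. The sketch below illustrates only that ordering and is not the authors' implementation: the names `stage_order`, `run_recipe`, and `train_step` are hypothetical, the datasets are placeholder strings, and any further details of PIT beyond "QA pairs before documents" are not covered by the text above and are left out.

```python
from typing import Callable, Dict, List, Sequence

# Stage orderings as described in the abstract (names are illustrative).
_ORDERS: Dict[str, List[str]] = {
    # Standard recipe: continued pre-training on documents, then QA instruction-tuning.
    "standard": ["documents", "qa_pairs"],
    # Pre-instruction-tuning: expose the model to QA pairs before the documents.
    "pre_instruction_tuning": ["qa_pairs", "documents"],
}


def stage_order(recipe: str) -> List[str]:
    """Return the training stages for a recipe, in the order they are run."""
    if recipe not in _ORDERS:
        raise ValueError(f"unknown recipe: {recipe!r}")
    return list(_ORDERS[recipe])


def run_recipe(
    recipe: str,
    datasets: Dict[str, Sequence[str]],
    train_step: Callable[[str, str], None],
) -> List[str]:
    """Call train_step(stage, example) for every example, stage by stage."""
    completed: List[str] = []
    for stage in stage_order(recipe):
        for example in datasets[stage]:
            train_step(stage, example)  # hypothetical hook for one model update
        completed.append(stage)
    return completed


if __name__ == "__main__":
    # Placeholder data; a real run would use tokenized documents and QA pairs.
    data = {
        "documents": ["doc A", "doc B"],
        "qa_pairs": ["Q1 / A1"],
    }
    log: List[str] = []
    run_recipe("pre_instruction_tuning", data, lambda stage, ex: log.append(f"{stage}: {ex}"))
    print("\n".join(log))
```

Passing the update function in as `train_step` keeps the sketch independent of any particular training framework; only the stage order changes between the two recipes.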
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling
