Instruction Position Matters in Sequence Generation with Large Language Models
Yijin Liu, Xianfeng Zeng, Fandong Meng, Jie Zhou

TL;DR
This paper shows that placing the task instruction after the input sentence in fine-tuning data improves large language models' ability to follow instructions, especially for long inputs, and raises zero-shot translation and summarization performance.
Contribution
The paper introduces a simple method, moving the task instruction so that it follows the input sentence, to improve instruction-following in LLMs. The method is supported by theoretical analysis and extensive experiments.
Findings
Zero-shot translation performance improves by up to 9.7 BLEU points.
Consistent gains across model scales and tasks.
No additional data or annotation required.
Abstract
Large language models (LLMs) are capable of performing conditional sequence generation tasks, such as translation or summarization, through instruction fine-tuning. The fine-tuning data is generally sequentially concatenated from a specific task instruction, an input sentence, and the corresponding response. Considering the locality modeled by the self-attention mechanism of LLMs, these models face the risk of instruction forgetting when generating responses for long input sentences. To mitigate this issue, we propose enhancing the instruction-following capability of LLMs by shifting the position of task instructions after the input sentences. Theoretical analysis suggests that our straightforward method can alter the model's learning focus, thereby emphasizing the training of instruction-following capabilities. Concurrently, experimental results demonstrate that our approach…
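The two data layouts contrasted in the abstract can be sketched as follows. This is a minimal illustration and not the authors' code: the function names, the newline separator, and the whitespace tokenization are all assumptions made for the example. It shows how putting the instruction last removes the input tokens that would otherwise sit between the instruction and the response.

```python
def build_training_text(instruction: str, source: str, response: str,
                        instruction_last: bool = False, sep: str = "\n") -> str:
    """Concatenate one fine-tuning example (hypothetical template).

    instruction_last=False: instruction, input, response (the usual layout).
    instruction_last=True:  input, instruction, response (the shifted layout).
    """
    prompt = [source, instruction] if instruction_last else [instruction, source]
    return sep.join(prompt + [response])


def instruction_to_response_gap(source: str, instruction_last: bool) -> int:
    """Count whitespace tokens lying between the instruction and the response.

    Whitespace splitting is a stand-in for a real tokenizer (assumption).
    """
    return 0 if instruction_last else len(source.split())


instruction = "Translate German to English."
source = "Guten Morgen, wie geht es Ihnen heute?"
response = "Good morning, how are you today?"

pre_text = build_training_text(instruction, source, response)
post_text = build_training_text(instruction, source, response, instruction_last=True)
```

With the usual layout the gap grows with the input length, while with the shifted layout the instruction is always adjacent to the response, whatever the length of the input.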
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification
