Mixture-of-Instructions: Aligning Large Language Models via Mixture Prompting
Bowen Xu, Shaoyu Wu, Kai Liu, Lulu Hu

TL;DR
This paper introduces Mixture-of-Instructions, a novel prompt strategy that improves large language model alignment across multiple tasks by using diverse system prompts and instruction packing, demonstrated on the Qwen-7B-chat model.
Contribution
The paper proposes Mixture-of-Instructions, a new technique combining instruction packing with diverse prompts to enhance multi-task alignment of large language models.
Findings
Qwen-SFT-MoI shows improved performance in coding, mathematics, and tool use tasks.
Diverse system prompts help prevent overfitting and improve inference.
Benchmark results indicate significant alignment efficiency gains.
Abstract
With the proliferation of large language models (LLMs), the comprehensive alignment of such models across multiple tasks has emerged as a critical area of research. Existing alignment methodologies primarily address single task, such as multi-turn dialogue, coding, mathematical problem-solving, and tool usage. Although there is a large amount of high-quality data available for those tasks, most of them provide only questions and answers without including the system prompt. Though a detailed analysis of the Qwen language model, we found that the system prompt has a significant impact on both training and inference processes of LLM. We attributes this phenomenon to overfitting to the system prompt. In address this issue, we introduce a novel technique termed Mixture-of-Instructions (MoI), which employs a strategy of instruction packing combined with diverse system prompts to boost the…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsNatural Language Processing Techniques · Topic Modeling
MethodsSparse Evolutionary Training
