MASTER: Enhancing Large Language Model via Multi-Agent Simulated Teaching
Liang Yue, Yihong Tang, Kehai Chen, Jie Liu, Min Zhang

TL;DR
MASTER introduces a multi-agent simulation approach to generate high-quality instruction data, significantly improving large language models' reasoning and multitask performance without extensive data collection.
Contribution
The paper presents a novel multi-agent interaction method for data augmentation, enhancing instruction fine-tuning datasets and model capabilities in NLP tasks.
Findings
Models fine-tuned with BOOST-QA, the dataset constructed with MASTER, outperform baselines on multiple benchmarks.
MASTER significantly boosts reasoning abilities in complex tasks.
The approach reduces data collection costs while improving model generalization.
Abstract
Instruction fine-tuning is crucial in NLP tasks, enhancing pretrained models' instruction-following capabilities and task-specific performance. However, obtaining high-quality fine-tuning data for large models is challenging due to data collection difficulties and high production costs. To address this, we propose MASTER, a novel data augmentation method that enriches original data through interactions among multiple agents with varying cognitive levels. We simulate three pedagogically grounded teaching scenarios, leveraging multi-agent conversations to generate high-quality teacher-student interaction data. Utilizing MASTER, we construct BOOST-QA, a fine-tuning dataset augmented from existing datasets like Orca-Math-200k, ProcQA, and OpenHermes2.5. Experiments show that models fine-tuned with BOOST-QA perform excellently across multiple benchmarks, demonstrating strong multitask…
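As a rough illustration of the idea in the abstract, the sketch below shows how a seed question-answer pair might be expanded into teacher-student interaction data. It is a minimal sketch under stated assumptions, not the authors' implementation: the `Agent` type, `simulate_teaching`, `to_training_example`, and the stub agents are all hypothetical names, the agents are plain functions standing in for language model calls, and the single correct-the-student exchange shown here is a generic placeholder rather than one of the paper's three teaching scenarios.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical type: an agent is any function mapping a prompt to a reply.
# In a real pipeline this would wrap a call to a language model.
Agent = Callable[[str], str]


@dataclass
class Turn:
    role: str
    text: str


def simulate_teaching(question: str, answer: str,
                      teacher: Agent, student: Agent,
                      rounds: int = 1) -> List[Turn]:
    """Run a short teacher-student exchange around one seed QA pair."""
    turns = [Turn("teacher", question)]
    for _ in range(rounds):
        # The student attempts the question at its own ability level.
        attempt = student(question)
        turns.append(Turn("student", attempt))
        # The teacher sees the reference answer and responds to the attempt.
        feedback = teacher(
            f"Question: {question}\nReference: {answer}\nStudent: {attempt}"
        )
        turns.append(Turn("teacher", feedback))
    return turns


def to_training_example(turns: List[Turn]) -> Dict[str, str]:
    """Flatten a dialogue into one instruction/response record."""
    *context, last = turns
    instruction = "\n".join(f"{t.role}: {t.text}" for t in context)
    return {"instruction": instruction, "response": last.text}


# Stub agents (assumptions) so the sketch runs without any model behind it.
def weak_student(prompt: str) -> str:
    return "I think the answer is 5."


def stub_teacher(prompt: str) -> str:
    return "Not quite. 2 + 2 = 4; recount the units."


dialogue = simulate_teaching("What is 2 + 2?", "4", stub_teacher, weak_student)
example = to_training_example(dialogue)
```

Swapping in student agents of different ability, or raising `rounds`, would yield several distinct dialogues from the same seed pair, which is the sense in which such a simulation augments an existing dataset.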
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Intelligent Tutoring Systems and Adaptive Learning
