MASTER: Enhancing Large Language Model via Multi-Agent Simulated Teaching

Liang Yue; Yihong Tang; Kehai Chen; Jie Liu; Min Zhang

arXiv:2506.02689·cs.CL·June 5, 2025

Open Access

TL;DR

MASTER introduces a multi-agent simulation approach to generate high-quality instruction data, significantly improving large language models' reasoning and multitask performance without extensive data collection.

Contribution

The paper presents a novel multi-agent interaction method for data augmentation, enhancing instruction fine-tuning datasets and model capabilities in NLP tasks.

Findings

01. Models fine-tuned with BOOST-QA outperform baselines on multiple benchmarks.

02. MASTER significantly boosts reasoning abilities in complex tasks.

03. The approach reduces data collection costs while improving model generalization.

Abstract

Instruction fine-tuning is crucial in NLP tasks, enhancing pretrained models' instruction-following capabilities and task-specific performance. However, obtaining high-quality fine-tuning data for large models is challenging due to data collection difficulties and high production costs. To address this, we propose MASTER, a novel data augmentation method that enriches original data through interactions among multiple agents with varying cognitive levels. We simulate three pedagogically grounded teaching scenarios, leveraging multi-agent conversations to generate high-quality teacher-student interaction data. Utilizing MASTER, we construct BOOST-QA, a fine-tuning dataset augmented from existing datasets like Orca-Math-200k, ProcQA, and OpenHermes2.5. Experiments show that models fine-tuned with BOOST-QA perform excellently across multiple benchmarks, demonstrating strong multitask…
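
The abstract describes agents with different cognitive levels conversing about a seed question to produce teacher-student interaction data. The following is only a minimal sketch of that general idea, not the authors' pipeline: the `Agent` class, the `simulate_lesson` function, the level labels ("novice", "advanced", "expert"), and the stubbed model are all hypothetical names introduced for illustration, and the paper's three teaching scenarios are not modeled here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# A "model" is any callable mapping a prompt string to a reply string.
# In practice this would wrap an LLM call; here it is a stub (assumption).
Model = Callable[[str], str]


@dataclass
class Agent:
    role: str    # "teacher" or "student"
    level: str   # hypothetical cognitive-level label, e.g. "novice"
    model: Model

    def speak(self, seed_question: str, history: List[Dict[str, str]]) -> Dict[str, str]:
        # Build a prompt from the seed question and the dialogue so far.
        transcript = "\n".join(f"{turn['role']}: {turn['text']}" for turn in history)
        prompt = f"[{self.role}/{self.level}] Question: {seed_question}\n{transcript}"
        return {"role": self.role, "level": self.level, "text": self.model(prompt)}


def simulate_lesson(seed_question: str, agents: List[Agent], rounds: int = 1) -> Dict:
    """Let each agent speak in turn for `rounds` rounds; return one augmented sample."""
    history: List[Dict[str, str]] = []
    for _ in range(rounds):
        for agent in agents:
            history.append(agent.speak(seed_question, history))
    return {"seed_question": seed_question, "dialogue": history}


def stub_model(tag: str) -> Model:
    # Stand-in for a real LLM; returns a deterministic placeholder reply.
    return lambda prompt: f"<{tag} reply to {len(prompt)} chars of context>"


agents = [
    Agent("student", "novice", stub_model("novice")),
    Agent("student", "advanced", stub_model("advanced")),
    Agent("teacher", "expert", stub_model("expert")),
]
sample = simulate_lesson("What is 12 * 7?", agents, rounds=2)
```

Each call yields one record pairing the seed question with a multi-turn dialogue. A real pipeline would replace `stub_model` with actual model calls and add whatever filtering the method requires before the records are used for fine-tuning.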

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Intelligent Tutoring Systems and Adaptive Learning