Optima: Optimizing Effectiveness and Efficiency for LLM-Based Multi-Agent System
Weize Chen, Jiarui Yuan, Chen Qian, Cheng Yang, Zhiyuan Liu, Maosong Sun

TL;DR
Optima is a novel framework that significantly improves communication efficiency and task effectiveness in LLM-based multi-agent systems through iterative training and advanced optimization techniques.
Contribution
It introduces a new training paradigm for LLM-based MAS that enhances scalability, efficiency, and effectiveness using iterative generate-rank-select-train cycles and diverse RL algorithms.
Findings
Up to 2.8x performance improvement over baselines
Uses less than 10% of the tokens on complex tasks
Enhances inference scalability and efficiency
Abstract
Large Language Model (LLM) based multi-agent systems (MAS) show remarkable potential in collaborative problem-solving, yet they still face critical challenges: low communication efficiency, poor scalability, and a lack of effective parameter-updating optimization methods. We present Optima, a novel framework that addresses these issues by significantly enhancing both communication efficiency and task effectiveness in LLM-based MAS through LLM training. Optima employs an iterative generate, rank, select, and train paradigm with a reward function balancing task performance, token efficiency, and communication readability. We explore various RL algorithms, including Supervised Fine-Tuning, Direct Preference Optimization, and their hybrid approaches, providing insights into their effectiveness-efficiency trade-offs. We integrate Monte Carlo Tree Search-inspired techniques for DPO data…
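The rank and select steps described in the abstract can be illustrated with a minimal sketch. The abstract only says that the reward balances task performance, token efficiency, and communication readability. The linear weighted form below, the weights `w_token` and `w_read`, the `Trajectory` fields, and the function names are all hypothetical choices made for illustration, not the paper's actual formulation.

```python
from dataclasses import dataclass


@dataclass
class Trajectory:
    """One sampled multi-agent conversation (hypothetical container)."""
    task_score: float   # task performance, assumed to lie in [0, 1]
    num_tokens: int     # total tokens exchanged between agents
    readability: float  # assumed readability score in [0, 1], higher is better


def reward(traj: Trajectory, max_tokens: int,
           w_token: float = 0.5, w_read: float = 0.2) -> float:
    """Assumed reward: task score, minus a token penalty, plus a readability bonus."""
    token_cost = min(traj.num_tokens / max_tokens, 1.0)  # normalized to [0, 1]
    return traj.task_score - w_token * token_cost + w_read * traj.readability


def rank_and_select(trajs: list[Trajectory], max_tokens: int,
                    top_k: int) -> list[Trajectory]:
    """Rank sampled trajectories by reward and keep the top_k for training."""
    ranked = sorted(trajs, key=lambda t: reward(t, max_tokens), reverse=True)
    return ranked[:top_k]


# Usage: a correct but verbose trajectory ranks below a correct, concise one.
verbose = Trajectory(task_score=1.0, num_tokens=1000, readability=0.5)
concise = Trajectory(task_score=1.0, num_tokens=100, readability=0.5)
wrong = Trajectory(task_score=0.0, num_tokens=50, readability=0.9)
selected = rank_and_select([verbose, concise, wrong], max_tokens=1000, top_k=2)
```

In an iterative setup, the selected trajectories would serve as training data for the next round, after which the updated model generates new trajectories and the cycle repeats.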
Taxonomy
Topics: Multi-Agent Systems and Negotiation
Methods: Direct Preference Optimization · LLaMA · Mixing Adam and SGD
