Learning Decentralized LLM Collaboration with Multi-Agent Actor Critic
Shuo Liu, Tianle Chen, Ryan Amiri, Christopher Amato

TL;DR
This paper introduces Multi-Agent Actor-Critic methods for decentralized LLM collaboration, analyzing their benefits and limitations across various tasks, and providing practical approaches with experimental validation.
Contribution
It proposes two MAAC approaches, CoLLM-CC and CoLLM-DC, for decentralized LLM collaboration, and analyzes their performance relative to Monte Carlo methods.
Findings
CoLLM-DC performs comparably to CoLLM-CC in short-horizon, dense-reward tasks.
Monte Carlo methods require more samples and underperform on long-horizon, sparse-reward tasks.
CoLLM-CC outperforms both Monte Carlo methods and CoLLM-DC in long-horizon or sparse-reward settings.
Abstract
Recent work has explored optimizing LLM collaboration through Multi-Agent Reinforcement Learning (MARL). However, most MARL fine-tuning approaches rely on predefined execution protocols, which often require centralized execution. Decentralized LLM collaboration is more appealing in practice, as agents can run inference in parallel with flexible deployments. Also, current approaches use Monte Carlo methods for fine-tuning, which suffer from high variance and thus require more samples to train effectively. Actor-critic methods are prevalent in MARL for dealing with these issues, so we developed Multi-Agent Actor-Critic (MAAC) methods to optimize decentralized LLM collaboration. In this paper, we analyze when and why these MAAC methods are beneficial. We propose two MAAC approaches, CoLLM-CC with a Centralized Critic and CoLLM-DC with…
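The contrast the abstract draws between Monte Carlo fine-tuning and actor-critic methods can be illustrated with a minimal, generic sketch. It is not the paper's CoLLM-CC or CoLLM-DC implementation. The function names, the discount factor `gamma`, and the toy reward and value lists are all assumptions made for illustration. A Monte Carlo learning signal depends on every reward until the end of the episode, while a one-step actor-critic signal depends only on the next reward and a learned value estimate.

```python
from typing import List


def monte_carlo_returns(rewards: List[float], gamma: float = 0.99) -> List[float]:
    """Discounted return-to-go: G_t = r_t + gamma * G_{t+1}.

    Each G_t sums all later rewards, so noise from every future step
    accumulates in the learning signal (the high-variance issue).
    """
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def td_advantages(
    rewards: List[float], values: List[float], gamma: float = 0.99
) -> List[float]:
    """One-step TD advantage: A_t = r_t + gamma * V(s_{t+1}) - V(s_t).

    `values[t]` is a (hypothetical) critic's estimate for step t; the value
    after the final step is taken to be 0. Only one sampled reward enters
    each A_t, trading variance for bias from the critic.
    """
    advantages = []
    for t, reward in enumerate(rewards):
        next_value = values[t + 1] if t + 1 < len(values) else 0.0
        advantages.append(reward + gamma * next_value - values[t])
    return advantages


# Toy sparse-reward episode: the only reward arrives at the last step.
toy_rewards = [0.0, 0.0, 1.0]
toy_returns = monte_carlo_returns(toy_rewards, gamma=0.5)
# A critic that predicts the returns exactly yields zero advantages.
toy_advantages = td_advantages(toy_rewards, toy_returns, gamma=0.5)
```

In this toy episode the critic is given the exact returns, so every advantage is zero. With a learned critic the estimates are imperfect, and the advantage measures how much better or worse a step went than the critic expected.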
