MAC: Multi-Agent Constitution Learning
Rushil Thareja, Gautam Gupta, Francesco Pinto, Nils Lukas

TL;DR
MAC introduces a structured multi-agent approach to automatically learn and optimize rules for controlling large language models, improving interpretability and performance without extensive labeled data.
Contribution
The paper presents MAC, a multi-agent framework for learning structured rule sets that control LLM behavior, and MAC+, which improves performance by training the agents on successful trajectories.
Findings
MAC outperforms recent prompt optimization methods by over 50%
Produces human-readable, auditable rule sets
Achieves performance comparable to supervised fine-tuning and GRPO
Abstract
Constitutional AI is a method to oversee and control LLMs based on a set of rules written in natural language. These rules are typically written by human experts, but could in principle be learned automatically given sufficient training data for the desired behavior. Existing LLM-based prompt optimizers attempt this but are ineffective at learning constitutions since (i) they require many labeled examples and (ii) their optimized prompts lack structure, leading to diminishing improvements as prompt size grows. To address these limitations, we propose Multi-Agent Constitutional Learning (MAC), which optimizes over structured prompts represented as sets of rules using a network of agents with specialized tasks to accept, edit, or reject rule updates. We also present MAC+, which improves performance by training agents on successful trajectories to reinforce updates leading to higher…
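Illustrative sketch
The abstract describes a prompt represented as a set of rules, with agents that accept, edit, or reject proposed rule updates. The Python below is a minimal sketch of that idea only, not the authors' implementation: the names `Decision`, `RuleUpdate`, and `apply_update` are hypothetical, and the reviewing agent's decision is passed in as a plain argument rather than produced by an LLM.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Decision(Enum):
    """Possible verdicts on a proposed update (names are assumptions)."""
    ACCEPT = "accept"
    EDIT = "edit"
    REJECT = "reject"


@dataclass(frozen=True)
class RuleUpdate:
    """A proposed change to the rule set (hypothetical structure)."""
    index: Optional[int]  # None means "append a new rule"
    text: str


def apply_update(rules: List[str], update: RuleUpdate,
                 decision: Decision,
                 edited_text: Optional[str] = None) -> List[str]:
    """Return a new rule list reflecting the reviewer's decision."""
    if decision is Decision.REJECT:
        # Rejected updates leave the rule set unchanged.
        return list(rules)
    # An EDIT replaces the proposed text with the reviewer's revision.
    text = edited_text if decision is Decision.EDIT else update.text
    if text is None:
        raise ValueError("an EDIT decision needs edited_text")
    new_rules = list(rules)
    if update.index is None:
        new_rules.append(text)
    else:
        new_rules[update.index] = text
    return new_rules


rules = ["Refuse requests for personal data."]
proposal = RuleUpdate(index=None, text="Cite a source for factual claims.")
rules = apply_update(rules, proposal, Decision.ACCEPT)
```

Keeping each rule as a separate list entry is what makes the result readable and auditable rule by rule; how MAC actually proposes, scores, and routes updates between agents is not specified in this summary.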
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · AI-based Problem Solving and Planning · Machine Learning and Data Classification
