Evolving Interpretable Constitutions for Multi-Agent Coordination
Ujwal Kumar, Alice Saito, Hershraj Niranjani, Rayan Yessou, Phan Xuan Tan

TL;DR
This paper introduces a framework for evolving interpretable behavioral norms in multi-agent systems, demonstrating that cooperative rules can be automatically discovered to improve social welfare and reduce conflict.
Contribution
The paper presents an evolutionary approach that automatically discovers effective, interpretable constitutions for multi-agent coordination; the evolved constitutions outperform human-designed norms.
Findings
Evolved constitutions significantly increase societal stability scores.
The discovered cooperative norms reduce communication and conflict.
Evolved rules outperform human-designed and baseline strategies.
Abstract
Constitutional AI has focused on single-model alignment using fixed principles. However, multi-agent systems create novel alignment challenges through emergent social dynamics. We present Constitutional Evolution, a framework for automatically discovering behavioral norms in multi-agent LLM systems. Using a grid-world simulation with survival pressure, we study the tension between individual and collective welfare, quantified via a Societal Stability Score S ∈ [0, 1] that combines productivity, survival, and conflict metrics. Adversarial constitutions lead to societal collapse (S = 0), while vague prosocial principles ("be helpful, harmless, honest") produce inconsistent coordination (S = 0.249). Even constitutions designed by Claude 4.5 Opus with explicit knowledge of the objective achieve only moderate performance (S = 0.332). Using LLM-driven genetic programming with multi-island…
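The abstract says only that S lies in [0, 1] and combines productivity, survival, and conflict metrics; it does not give the formula here. As a rough illustration of how such a composite score could be built, the sketch below assumes each metric is already normalized to [0, 1], that conflict counts against stability, and that the three terms are combined by a weighted mean. The names `EpisodeMetrics`, `stability_score`, and the equal default weights are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class EpisodeMetrics:
    """Hypothetical per-episode summary; every field is assumed to lie in [0, 1]."""
    productivity: float  # assumed: normalized resource output of the society
    survival: float      # assumed: fraction of agents alive at episode end
    conflict: float      # assumed: normalized rate of hostile interactions


def clamp01(x: float) -> float:
    """Clip a value into the closed interval [0, 1]."""
    return max(0.0, min(1.0, x))


def stability_score(m: EpisodeMetrics, weights=(1.0, 1.0, 1.0)) -> float:
    """Illustrative composite score in [0, 1].

    Assumption: a weighted mean of productivity, survival, and (1 - conflict).
    The paper's actual definition of S may differ.
    """
    w_p, w_s, w_c = weights
    total = w_p + w_s + w_c
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = (
        w_p * clamp01(m.productivity)
        + w_s * clamp01(m.survival)
        + w_c * (1.0 - clamp01(m.conflict))
    )
    return clamp01(weighted / total)


if __name__ == "__main__":
    # Made-up inputs for illustration only; these are not results from the paper.
    collapsed = EpisodeMetrics(productivity=0.0, survival=0.0, conflict=1.0)
    thriving = EpisodeMetrics(productivity=1.0, survival=1.0, conflict=0.0)
    print(stability_score(collapsed), stability_score(thriving))
```

Under this assumed form, a society with no output, no survivors, and maximal conflict scores 0, which is at least consistent with the collapse case described in the abstract; the intermediate values reported there cannot be reproduced without the paper's actual definition.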
Taxonomy
Topics: Reinforcement Learning in Robotics · Generative Adversarial Networks and Image Synthesis · Artificial Intelligence in Games
