Causal Foundations of Collective Agency
Frederik Hytting Jørgensen, Sebastian Weichwald, Lewis Hammond

TL;DR
This paper introduces a causal framework for determining when a group of agents can be treated as a unified collective agent, which helps in understanding and controlling emergent behavior in multi-agent AI systems.
Contribution
It formalizes collective agency using causal games and causal abstraction, providing a behavioral perspective on collective agency and tools for analyzing multi-agent interactions.
Findings
Solved a multi-agent incentive puzzle in actor-critic models.
Quantitatively assessed collective agency in voting mechanisms.
Provided a foundation for predicting emergent collective behaviors.
Abstract
A key challenge for the safety of advanced AI systems is the possibility that multiple simpler agents might inadvertently form a collective agent with capabilities and goals distinct from those of any individual. More generally, determining when a group of agents can be viewed as a unified collective agent is a foundational question in the study of interactions and incentives in both biological and artificial systems. We adopt a behavioral perspective in answering this question, ascribing collective agency to a group when viewing the group's joint actions as rational and goal-directed successfully predicts its behavior. We formalize this perspective on collective agency using causal games -- which are causal models of strategic, multi-agent interactions -- and causal abstraction -- which formalizes when a simple, high-level model faithfully captures a more complex, low-level model. We…
