Beyond Arrow's Impossibility: Fairness as an Emergent Property of Multi-Agent Collaboration
Sayan Kumar Chaki, Antoine Gourru, Julien Velcin

TL;DR
This paper studies fairness as something that emerges from interaction between language-model agents. It shows that negotiation and alignment shape fairness outcomes, and that collective deliberation offers a way to navigate Arrow-style impossibility constraints.
Contribution
The paper introduces a framework for studying fairness as an emergent property of agent interactions: a controlled hospital triage setting in which two agents, one aligned and one unaligned or adversarial, negotiate an allocation over three structured debate rounds.
Findings
Aligned agents moderate bias through contestation rather than override
Joint allocations can satisfy fairness criteria not achievable individually
Intrinsic biases persist even in explicitly aligned agents
Abstract
Fairness in language models is typically studied as a property of a single, centrally optimized model. As large language models become increasingly agentic, we propose that fairness emerges through interaction and exchange. We study this via a controlled hospital triage framework in which two agents negotiate over three structured debate rounds. One agent is aligned to a specific ethical framework via retrieval-augmented generation (RAG), while the other is either unaligned or adversarially prompted to favor demographic groups over clinical need. We find that alignment systematically shapes negotiation strategies and allocation patterns, and that neither agent's allocation is ethically adequate in isolation, yet their joint final allocation can satisfy fairness criteria that neither would have reached alone. Aligned agents partially moderate bias through contestation rather than…
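To make the shape of the protocol concrete, the following is a minimal sketch of a two-agent, three-round negotiation over an allocation. It is not the authors' implementation: the agents in the paper are language models (one aligned via RAG), whereas `StubAgent`, its numeric `concession` rate, and the rule that the joint allocation is the mean of the two final proposals are all hypothetical stand-ins chosen only so the loop runs.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Patient id -> share of a scarce resource; shares sum to 1 (assumed encoding).
Allocation = Dict[str, float]


@dataclass
class StubAgent:
    """Hypothetical stand-in for an LLM agent; not the paper's agent."""
    name: str
    preferred: Allocation   # the allocation the agent would choose in isolation
    concession: float       # fraction of the gap to the other agent closed per round

    def propose(self, own_last: Allocation, other_last: Allocation) -> Allocation:
        # Move part of the way from our last proposal toward the other agent's.
        return {
            p: own_last[p] + self.concession * (other_last[p] - own_last[p])
            for p in own_last
        }


def negotiate(
    a: StubAgent, b: StubAgent, rounds: int = 3
) -> Tuple[Allocation, List[Tuple[Allocation, Allocation]]]:
    """Run a fixed number of debate rounds and return (joint allocation, transcript)."""
    pa, pb = dict(a.preferred), dict(b.preferred)
    transcript = [(pa, pb)]  # entry 0 holds the isolated, pre-debate allocations
    for _ in range(rounds):
        # Both agents respond to the previous round's proposals simultaneously.
        pa, pb = a.propose(pa, pb), b.propose(pb, pa)
        transcript.append((pa, pb))
    # Assumption: the joint allocation is the mean of the final proposals.
    joint = {p: (pa[p] + pb[p]) / 2 for p in pa}
    return joint, transcript


aligned = StubAgent("aligned", {"patient_1": 0.7, "patient_2": 0.3}, concession=0.2)
biased = StubAgent("adversarial", {"patient_1": 0.2, "patient_2": 0.8}, concession=0.1)
joint, transcript = negotiate(aligned, biased, rounds=3)
print(joint)
```

In this toy version the joint allocation always lands between the two isolated allocations, which only illustrates the idea that the outcome is a product of the exchange rather than of either agent alone. Whether a given joint allocation satisfies a fairness criterion would have to be checked separately against the criteria the paper defines.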
