Value Alignment Equilibrium in Multiagent Systems
Nieves Montes, Carles Sierra

TL;DR
This paper reviews a model of value alignment in multiagent systems, introduces the concepts of alignment equilibrium and Pareto optimal alignment, and demonstrates their application using the Iterated Prisoner's Dilemma.
Contribution
It extends an existing value alignment framework with two new concepts, alignment equilibrium and Pareto optimal alignment. These are modeled on the Nash equilibrium and Pareto optimality, but can account for any value modeled in the system.
Findings
Introduces alignment equilibrium and Pareto optimal alignment concepts.
Applies the framework to the Iterated Prisoner's Dilemma as a use-case.
Uses the use-case to exemplify the reviewed definitions of values and of alignment between norms and values.
Abstract
Value alignment has emerged in recent years as a basic principle to produce beneficial and mindful Artificial Intelligence systems. It states that autonomous entities should behave in a way that is aligned with our human values. In this work, we summarize a previously developed model that considers values as preferences over states of the world and defines alignment between the governing norms and the values. We provide a use-case for this framework with the Iterated Prisoner's Dilemma model, which we use to exemplify the definitions we review. We take advantage of this use-case to introduce new concepts to be integrated with the established framework: alignment equilibrium and Pareto optimal alignment. These are inspired by the classical Nash equilibrium and Pareto optimality, but are designed to account for any value we wish to model in the system.
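The idea of an equilibrium defined over alignment rather than over raw payoff can be illustrated with a small sketch. This is not the paper's formal model: it uses a one-shot Prisoner's Dilemma rather than the iterated game, the payoff numbers are illustrative, and the two alignment functions (`gain_alignment`, `equality_alignment`) are hypothetical stand-ins for "any value we wish to model". It only shows the Nash-style and Pareto-style checks applied to an arbitrary alignment score.

```python
from itertools import product

ACTIONS = ("C", "D")  # cooperate, defect

# Illustrative one-shot Prisoner's Dilemma payoffs (agent 0, agent 1).
# Assumption: these numbers are not taken from the paper.
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def gain_alignment(profile, agent):
    """Hypothetical alignment with 'personal gain': the agent's own payoff."""
    return PAYOFFS[profile][agent]

def equality_alignment(profile, agent):
    """Hypothetical alignment with 'equality': minus the payoff gap."""
    a, b = PAYOFFS[profile]
    return -abs(a - b)

def alignment_equilibria(alignment):
    """Profiles where no agent can raise its alignment by deviating alone."""
    stable_profiles = []
    for profile in product(ACTIONS, repeat=2):
        stable = True
        for agent in (0, 1):
            for alternative in ACTIONS:
                deviated = list(profile)
                deviated[agent] = alternative
                if alignment(tuple(deviated), agent) > alignment(profile, agent):
                    stable = False
        if stable:
            stable_profiles.append(profile)
    return stable_profiles

def pareto_optimal(alignment):
    """Profiles that no other profile dominates in terms of alignment."""
    profiles = list(product(ACTIONS, repeat=2))

    def scores(p):
        return tuple(alignment(p, agent) for agent in (0, 1))

    def dominates(q, p):
        sq, sp = scores(q), scores(p)
        at_least_as_good = all(x >= y for x, y in zip(sq, sp))
        strictly_better = any(x > y for x, y in zip(sq, sp))
        return at_least_as_good and strictly_better

    return [p for p in profiles if not any(dominates(q, p) for q in profiles)]

if __name__ == "__main__":
    for name, fn in (("gain", gain_alignment), ("equality", equality_alignment)):
        print(name, alignment_equilibria(fn), pareto_optimal(fn))
```

Under these assumptions, swapping the value changes the outcome: with personal gain the only equilibrium is mutual defection, which is not Pareto optimal, while with equality both mutual cooperation and mutual defection are equilibria and Pareto optimal.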
