Relative Principals, Pluralistic Alignment, and the Structural Value Alignment Problem
Travis LaCroix

TL;DR
This paper reframes AI value alignment as a governance issue involving objectives, information, and stakeholders, emphasizing ongoing institutional management over purely technical solutions.
Contribution
It introduces a three-axis framework for diagnosing AI misalignment, highlighting the importance of governance and pluralism in alignment efforts.
Findings
Alignment involves trade-offs among competing values.
Misalignment can arise along three interacting axes: objectives, information, and principals (the stakeholders whose interests count).
Effective alignment requires institutional processes, not just technical fixes.
Abstract
The value alignment problem for artificial intelligence (AI) is often framed as a purely technical or normative challenge, sometimes focused on hypothetical future systems. I argue that the problem is better understood as a structural question about governance: not whether an AI system is aligned in the abstract, but whether it is aligned enough, for whom, and at what cost. Drawing on the principal-agent framework from economics, this paper reconceptualises misalignment as arising along three interacting axes: objectives, information, and principals. The three-axis framework provides a systematic way of diagnosing why misalignment arises in real-world systems and clarifies that alignment is not a single technical property of models but an outcome shaped by how objectives are specified, how information is distributed, and whose interests count in practice. The core…
