Full-Stack Alignment: Co-Aligning AI and Institutions with Thick Models of Value

Joe Edelman; Tan Zhi-Xuan; Ryan Lowe; Oliver Klingefjord; Vincent Wang-Mascianica; Matija Franklin; Ryan Othniel Kearns; Ellie Hain; Atrisha Sarkar; Michiel Bakker; Fazl Barez; David Duvenaud; Jakob Foerster; Iason Gabriel; Joseph Gubbels; Bryce Goodman; Andreas Haupt; Jobst Heitzig; Julian Jara-Ettinger; Atoosa Kasirzadeh; James Ravi Kirkpatrick; Andrew Koh; W. Bradley Knox; Philipp Koralus; Joel Lehman; Sydney Levine; Samuele Marro; Manon Revel; Toby Shorin; Morgan Sutherland; Michael Henry Tessler; Ivan Vendrov; James Wilken-Smith

arXiv:2512.03399 · cs.LG · December 4, 2025

Open Access

TL;DR

This paper advocates for full-stack alignment of AI and societal institutions using thick models of value to ensure beneficial outcomes across complex social systems.

Contribution

It introduces the concept of thick models of value for better normative reasoning and demonstrates their application in AI stewardship, negotiation, economics, and regulation.

Findings

01. Thick models of value improve normative reasoning in AI systems.

02. Application of thick models enhances societal decision-making processes.

03. Full-stack alignment can mitigate misaligned institutional goals.

Abstract

Beneficial societal outcomes cannot be guaranteed by aligning individual AI systems with the intentions of their operators or users. Even an AI system that is perfectly aligned to the intentions of its operating organization can lead to bad outcomes if the goals of that organization are misaligned with those of other institutions and individuals. For this reason, we need full-stack alignment, the concurrent alignment of AI systems and the institutions that shape them with what people value. This can be done without imposing a particular vision of individual or collective flourishing. We argue that current approaches for representing values, such as utility functions, preference orderings, or unstructured text, struggle to address these and other issues effectively. They struggle to distinguish values from other signals, to support principled normative reasoning, and to model collective…

Taxonomy

Topics: Ethics and Social Impacts of AI · Explainable Artificial Intelligence (XAI) · Innovation, Sustainability, Human-Machine Systems