Dynamic Normativity: Necessary and Sufficient Conditions for Value Alignment
Nicholas Kluge Corrêa

TL;DR
This paper introduces a philosophical and technical framework called Dynamic Normativity, establishing necessary and sufficient conditions for aligning AI systems with human values, supported by practical implementations in language models.
Contribution
It formulates foundational necessary and sufficient conditions for AI value alignment, bridging philosophical theory with practical alignment methods.
Findings
Proposes necessary and sufficient conditions for AI alignment.
Develops a framework called Dynamic Normativity.
Demonstrates implementation with state-of-the-art language models.
Abstract
The critical inquiry pervading the realm of Philosophy, and perhaps extending its influence across all Humanities disciplines, revolves around the intricacies of morality and normativity. Surprisingly, in recent years, this thematic thread has woven its way into an unexpected domain, one not conventionally associated with pondering "what ought to be": the field of artificial intelligence (AI) research. Central to morality and AI, we find "alignment", a problem related to the challenges of expressing human goals and values in a manner that artificial systems can follow without leading to unwanted adversarial effects. More explicitly and with our current paradigm of AI development in mind, we can think of alignment as teaching human values to non-anthropomorphic entities trained through opaque, gradient-based learning techniques. This work addresses alignment as a technical-philosophical…
Taxonomy
Topics: Economic Theory and Institutions · Complex Systems and Decision Making
