Disentangling AI Alignment: A Structured Taxonomy Beyond Safety and Ethics
Kevin Baum

TL;DR
This paper proposes a structured taxonomy for AI alignment, clarifying the conceptual boundaries among safety, ethics, legality, and other goals to guide interdisciplinary research and practical implementation.
Contribution
It introduces a comprehensive taxonomy that distinguishes alignment aims, scope, and constituency, offering a foundational framework for integrating diverse AI alignment perspectives.
Findings
Reveals multiple legitimate alignment configurations
Provides a structured framework for understanding AI alignment
Clarifies the relationship between safety, ethics, and legality
Abstract
Recent advances in AI research make it increasingly plausible that artificial agents with consequential real-world impact will soon operate beyond tightly controlled environments. Ensuring that these agents are not only safe but that they adhere to broader normative expectations is thus an urgent interdisciplinary challenge. Multiple fields -- notably AI Safety, AI Alignment, and Machine Ethics -- claim to contribute to this task. However, the conceptual boundaries and interrelations among these domains remain vague, leaving researchers without clear guidance in positioning their work. To address this meta-challenge, we develop a structured conceptual framework for understanding AI alignment. Rather than focusing solely on alignment goals, we introduce a taxonomy distinguishing the alignment aim (safety, ethicality, legality, etc.), scope (outcome vs. execution), and constituency…
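As a rough illustration of the three dimensions named in the abstract, the sketch below models one alignment configuration as a small record. The names `Aim`, `Scope`, `AlignmentConfiguration`, and `enumerate_configurations` are hypothetical and do not come from the paper. The aim list is not exhaustive (the abstract says "etc."), and constituency is left as a free-form label because the abstract is cut off before describing it. Treating configurations as simple combinations of aim and scope is an assumption made here for illustration only.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import product
from typing import List


class Aim(Enum):
    """Alignment aims named in the abstract (non-exhaustive: the source says 'etc.')."""
    SAFETY = "safety"
    ETHICALITY = "ethicality"
    LEGALITY = "legality"


class Scope(Enum):
    """Alignment scope as named in the abstract: outcome vs. execution."""
    OUTCOME = "outcome"
    EXECUTION = "execution"


@dataclass(frozen=True)
class AlignmentConfiguration:
    """One point in the taxonomy (hypothetical representation)."""
    aim: Aim
    scope: Scope
    # Free-form label: the abstract is truncated before constituency is defined.
    constituency: str


def enumerate_configurations(constituency: str) -> List[AlignmentConfiguration]:
    """List every aim/scope pairing for a given constituency label.

    Assumption: configurations are treated as plain combinations of aim and
    scope; the paper may constrain or structure them differently.
    """
    return [
        AlignmentConfiguration(aim, scope, constituency)
        for aim, scope in product(Aim, Scope)
    ]


if __name__ == "__main__":
    for config in enumerate_configurations("example-constituency"):
        print(config.aim.value, config.scope.value, config.constituency)
```

Even with only the aims and scopes listed in the abstract, this toy enumeration yields several distinct combinations per constituency, which is one way to read the claim that multiple alignment configurations are possible.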
Taxonomy
Topics: Ethics and Social Impacts of AI · Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI)
