Disentangling AI Alignment: A Structured Taxonomy Beyond Safety and Ethics

Kevin Baum

arXiv:2506.06286·cs.CY·June 10, 2025

Open Access

TL;DR

This paper proposes a structured taxonomy for AI alignment, clarifying the conceptual boundaries among safety, ethics, legality, and other goals to guide interdisciplinary research and practical implementation.

Contribution

It introduces a comprehensive taxonomy that distinguishes alignment aims, scope, and constituency, offering a foundational framework for integrating diverse AI alignment perspectives.

Findings

01 Reveals multiple legitimate alignment configurations

02 Provides a structured framework for understanding AI alignment

03 Clarifies the relationship between safety, ethics, and legality

Abstract

Recent advances in AI research make it increasingly plausible that artificial agents with consequential real-world impact will soon operate beyond tightly controlled environments. Ensuring that these agents are not only safe but that they adhere to broader normative expectations is thus an urgent interdisciplinary challenge. Multiple fields -- notably AI Safety, AI Alignment, and Machine Ethics -- claim to contribute to this task. However, the conceptual boundaries and interrelations among these domains remain vague, leaving researchers without clear guidance in positioning their work. To address this meta-challenge, we develop a structured conceptual framework for understanding AI alignment. Rather than focusing solely on alignment goals, we introduce a taxonomy distinguishing the alignment aim (safety, ethicality, legality, etc.), scope (outcome vs. execution), and constituency…

Taxonomy

Topics: Ethics and Social Impacts of AI · Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI)