Beyond Preferences: Learning Alignment Principles Grounded in Human Reasons and Values

Henry Bell; Lara Neubauer da Costa Schertel; Bochu Ding; Brandon Fain

arXiv:2601.18760 · cs.LG · January 27, 2026

Open Access

TL;DR

This paper introduces Grounded Constitutional AI (GCAI), a framework that creates AI alignment principles based on human reasons and values, improving fairness, moral grounding, and human preference alignment.

Contribution

The paper extends the Inverse Constitutional AI (ICAI) approach by incorporating human-provided reasons and values to generate more representative and morally grounded AI constitutions.

Findings

01. GCAI-generated constitutions are preferred by humans over ICAI ones.

02. Participants find GCAI constitutions more morally grounded and coherent.

03. GCAI effectively combines general principles and contextual preferences.

Abstract

A crucial consideration when developing and deploying Large Language Models (LLMs) is the human values to which these models are aligned. In the constitutional framework of alignment, models are aligned to a set of principles (the constitution) specified in natural language. However, it is unclear how to fairly determine this constitution with widespread stakeholder input. In this work, we propose Grounded Constitutional AI (GCAI), a unified framework for generating constitutions of principles that are representative of both users' general expectations toward AI (general principles) and their interaction-time preferences (contextual principles). We extend the Inverse Constitutional AI (ICAI) approach to generate contextual principles from human preference annotation data by leveraging human-provided reasons for their preferences. We supplement these contextual principles with…

Taxonomy

Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Ethics and Social Impacts of AI