CALMA: A Process for Deriving Context-aligned Axes for Language Model Alignment
Prajna Soni, Deepika Raman, Dylan Hadfield-Menell

TL;DR
CALMA is a participatory methodology that derives context-specific axes for evaluating and aligning language models, addressing the limitations of broad, Western-centric benchmarks by incorporating diverse community priorities.
Contribution
The paper introduces CALMA, a novel, grounded approach for eliciting context-relevant axes for language model evaluation and alignment, with an emphasis on participatory and open-ended processes.
Findings
CALMA surfaced community-specific priorities absent from standard benchmarks.
Evaluation practices based on CALMA are more open-ended and use-case-driven.
The methodology supports more pluralistic and transparent alignment pipelines.
Abstract
Datasets play a central role in AI governance by enabling both evaluation (measuring capabilities) and alignment (enforcing values) along axes such as helpfulness, harmlessness, toxicity, quality, and more. However, most alignment and evaluation datasets depend on researcher-defined or developer-defined axes curated from non-representative samples. As a result, developers typically benchmark models against broad (often Western-centric) values that overlook the varied contexts of their real-world deployment. Consequently, models trained on such proxies can fail to meet the needs and expectations of diverse user communities within these deployment contexts. To bridge this gap, we introduce CALMA (Context-aligned Axes for Language Model Alignment), a grounded, participatory methodology for eliciting context-relevant axes for evaluation and alignment. In a pilot with two distinct…
