Alignment Makes Language Models Normative, Not Descriptive

Eilam Shapira; Moshe Tennenholtz; Roi Reichart

arXiv:2603.17218 · cs.CL · March 19, 2026

TL;DR

Alignment improves a language model's predictions of human decisions in one-shot settings, where behavior tends to follow normative predictions, but reduces its accuracy in multi-round strategic interactions, where base models predict human choices better.

Contribution

This study demonstrates that alignment induces a normative bias in language models, improving predictions in normative settings while impairing performance in descriptive, strategic contexts.

Findings

01

Aligned models outperform base models at predicting human choices in one-shot textbook games and non-strategic lottery choices.

02

Base models better predict human choices in multi-round strategic interactions.

03

Alignment causes a trade-off, favoring normative predictions over descriptive accuracy.

Abstract

Post-training alignment optimizes language models to match human preference signals, but this objective is not equivalent to modeling observed human behavior. We compare 120 base-aligned model pairs on more than 10,000 real human decisions in multi-round strategic games: bargaining, persuasion, negotiation, and repeated matrix games. In these settings, base models outperform their aligned counterparts in predicting human choices by nearly 10:1, robustly across model families, prompt formulations, and game configurations. This pattern reverses, however, in settings where human behavior is more likely to follow normative predictions: aligned models dominate on one-shot textbook games across all 12 types tested and on non-strategic lottery choices, and even within the multi-round games themselves at round one, before interaction history develops. This boundary-condition pattern suggests…
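The "nearly 10:1" figure in the abstract reads as a win ratio across base-aligned model pairs. The sketch below shows one way such a tally could be computed, assuming each pair is summarized by a single prediction-accuracy score per model. The abstract does not state the metric or the comparison procedure, so the `PairResult` fields, the `win_counts` helper, and all numbers are hypothetical illustrations, not the authors' method or data.

```python
from dataclasses import dataclass


@dataclass
class PairResult:
    """Scores for one base/aligned model pair (hypothetical fields)."""
    name: str
    base_accuracy: float     # assumed: accuracy of the base model at predicting human choices
    aligned_accuracy: float  # assumed: accuracy of the aligned counterpart on the same decisions


def win_counts(results):
    """Return (base_wins, aligned_wins); pairs with equal scores count for neither."""
    base_wins = sum(r.base_accuracy > r.aligned_accuracy for r in results)
    aligned_wins = sum(r.aligned_accuracy > r.base_accuracy for r in results)
    return base_wins, aligned_wins


# Made-up numbers for illustration only; these are not results from the paper.
toy_results = [
    PairResult("pair-a", 0.71, 0.64),
    PairResult("pair-b", 0.68, 0.66),
    PairResult("pair-c", 0.59, 0.62),
    PairResult("pair-d", 0.73, 0.73),
]

base_wins, aligned_wins = win_counts(toy_results)
print(f"base wins: {base_wins}, aligned wins: {aligned_wins}")
```

On the toy data this tallies two wins for base models and one for aligned models, with one tie excluded. A ratio such as 10:1 would correspond to base wins outnumbering aligned wins roughly tenfold under whatever scoring the paper actually uses.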

Taxonomy

Topics: Experimental Behavioral Economics Studies · Game Theory and Applications · Artificial Intelligence in Games