Should I State or Should I Show? Aligning AI with Human Preferences

Keaton Ellis; Wanying Huang

arXiv:2603.29317·econ.GN·April 1, 2026

TL;DR

This study compares stated and revealed preferences as ways of aligning an AI agent with human choices under risk. It finds that, on average, revealed preferences lead to more accurate predictions of human behavior.

Contribution

The paper provides experimental evidence that revealed-preference data aligns an AI agent with human choices better than stated-preference prompts do, and it shows that subjects have difficulty translating their own preferences into written instructions.

Findings

01

Revealed preferences yield more accurate AI predictions of human choices.

02

Subjects struggle to accurately translate preferences into written prompts.

03

When prompts and choice data conflict, the AI agent aligns more closely with the prompts, despite their lower accuracy.

Abstract

As AI agents become more autonomous, properly aligning their objectives with human preferences becomes increasingly important. We study how effectively an AI agent learns a human principal's preference in choice under risk via stated versus revealed preferences. We conduct an online experiment in which subjects state their preferences through written instructions ("prompts") and reveal them through choices in a series of binary lottery questions ("data"). We find that on average, an AI agent given revealed-preference data predicts subjects' choices more accurately than an AI agent given stated-preference prompts. Further analysis suggests that the gap is driven by subjects' difficulty in translating their own preferences into written instructions. When given a choice between which information source to give to an AI agent, a large portion of subjects fail to select the more informative…
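The comparison described in the abstract can be illustrated with a minimal sketch. It assumes prediction accuracy means the share of binary lottery questions on which an agent's predicted choice matches the subject's actual choice; the abstract does not spell out the exact metric. The function name `prediction_accuracy` and all of the choice data below are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def prediction_accuracy(predicted: Sequence[str], actual: Sequence[str]) -> float:
    """Share of questions where the predicted choice equals the actual choice."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    matches = sum(p == a for p, a in zip(predicted, actual))
    return matches / len(actual)


# Hypothetical toy data (NOT from the paper): "A" and "B" label the two
# lotteries offered in each binary question.
subject_choices = ["A", "B", "B", "A", "B"]

# Predictions from an agent given the subject's past choices ("data").
agent_from_data = ["A", "B", "B", "A", "A"]

# Predictions from an agent given the subject's written instructions ("prompt").
agent_from_prompt = ["A", "A", "B", "B", "A"]

acc_data = prediction_accuracy(agent_from_data, subject_choices)
acc_prompt = prediction_accuracy(agent_from_prompt, subject_choices)

# A positive gap means the data-based agent predicted this subject better.
gap = acc_data - acc_prompt
print(f"data: {acc_data:.2f}, prompt: {acc_prompt:.2f}, gap: {gap:.2f}")
```

Averaging such per-subject gaps across all subjects would give one simple summary of which information source is more informative on average, though the paper's own analysis may differ.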
