A Statistical Case Against Empirical Human-AI Alignment

Julian Rodemann; Esteban Garces Arias; Christoph Luther; Christoph Jansen; Thomas Augustin

arXiv:2502.14581 · cs.AI · May 13, 2025

Open Access

TL;DR

This position paper argues that naive empirical human-AI alignment can introduce statistical biases and proposes prescriptive alignment and a posteriori empirical alignment as alternatives, illustrated by examples such as human-centric decoding of language models.

Contribution

It presents a principled critique of naive empirical alignment and proposes prescriptive alignment and a posteriori empirical alignment as alternatives.

Findings

01. Naive empirical alignment can introduce statistical biases.

02. Prescriptive and a posteriori approaches mitigate bias issues.

03. Examples include human-centric decoding of language models.

Abstract

Empirical human-AI alignment aims to make AI systems act in line with observed human behavior. While noble in its goals, we argue that empirical alignment can inadvertently introduce statistical biases that warrant caution. This position paper thus advocates against naive empirical alignment, offering prescriptive alignment and a posteriori empirical alignment as alternatives. We substantiate our principled argument by tangible examples like human-centric decoding of language models.

Taxonomy

Topics: Machine Learning and Data Classification · AI-based Problem Solving and Planning · Explainable Artificial Intelligence (XAI)