Moral Change or Noise? On Problems of Aligning AI With Temporally Unstable Human Feedback

Vijay Keswani; Cyrus Cousins; Breanna Nguyen; Vincent Conitzer; Hoda Heidari; Jana Schaich Borg; and Walter Sinnott-Armstrong

arXiv:2511.10032 · cs.HC · November 14, 2025

TL;DR

This paper investigates how human moral preferences evolve over time and the implications for AI alignment, revealing that preferences are often unstable and that current models struggle to adapt, raising challenges for trustworthy AI in high-stakes domains.

Contribution

It provides empirical evidence of moral preference instability over time and analyzes its impact on AI alignment, emphasizing the need to account for dynamic human values.

Findings

01. Participants change their responses 6-20% of the time across sessions.

02. Significant shifts are observed in participants' decision models over time.

03. The predictive performance of AI models decreases with preference and model instability.

Abstract

Alignment methods in moral domains seek to elicit moral preferences of human stakeholders and incorporate them into AI. This presupposes moral preferences as static targets, but such preferences often evolve over time. Proper alignment of AI to dynamic human preferences should ideally account for "legitimate" changes to moral reasoning, while ignoring changes related to attention deficits, cognitive biases, or other arbitrary factors. However, common AI alignment approaches largely neglect temporal changes in preferences, posing serious challenges to proper alignment, especially in high-stakes applications of AI, e.g., in healthcare domains, where misalignment can jeopardize the trustworthiness of the system and yield serious individual and societal harms. This work investigates the extent to which people's moral preferences change over time, and the impact of such changes on AI…

Taxonomy

Topics: Ethics and Social Impacts of AI · Explainable Artificial Intelligence (XAI) · Artificial Intelligence in Healthcare and Education