AMEL: Accumulated Message Effects on LLM Judgments
Sid-ali Temkit

TL;DR
This study investigates how prior conversation polarity biases large language models' judgments, revealing a persistent negativity bias and that context length does not mitigate this effect, impacting evaluation reliability.
Contribution
It provides the first large-scale empirical evidence of the accumulated message effect (AMEL) across multiple models and providers, highlighting the bias's characteristics and potential mitigation strategies.
Findings
Models shift toward the prevailing conversation polarity (d = -0.17).
Negative histories induce 1.62x more bias than positive ones.
Bias does not increase with longer context lengths.
Abstract
Large language models are routinely used as automated evaluators: to review code, moderate content, or score outputs, often with many items passing through one conversation. We ask whether the polarity of prior conversation history biases subsequent judgments, an effect we call the accumulated message effect on LLM judgments (AMEL). Across 75,898 API calls to 11 models from 4 providers (OpenAI, Anthropic, Google, and four open-source models), we present identical test items in isolation or following histories saturated with predominantly positive or negative evaluations. Models shift toward the conversation's prevailing polarity (d = -0.17, p < 10^-46). The effect concentrates on items where the model is genuinely uncertain at baseline (d = -0.34 for high-entropy items, vs d = -0.15 when the baseline is deterministic). Bias does not grow with context length: 5 prior turns and 50…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
