Are LLM Evaluators Really Narcissists? Sanity Checking Self-Preference Evaluations

Dani Roytburg; Matthew Bozoukov; Matthew Nguyen; Jou Barzdukas; Mackenzie Puig-Hall; Narmeen Oozeer

arXiv:2601.22548 · cs.CL · February 13, 2026

Open Access

TL;DR

This paper investigates the self-preference bias in LLM evaluators, identifies a core confound affecting measurements, and introduces a baseline to improve the accuracy of bias detection in automated evaluations.

Contribution

The paper identifies a key methodological confound in measuring LLM self-preference bias and proposes a baseline that separates true bias from noisy responses on hard problems, improving evaluation reliability.

Findings

01

Only 51% of initial bias findings remain significant after correction.

02

Correcting for a core methodological confound could reduce measurement error by 89.6%.

03

The baseline helps isolate genuine self-preference signals.

Abstract

Recent research has shown that large language models (LLMs) favor their own outputs when acting as judges, undermining the integrity of automated post-training and evaluation workflows. However, it is difficult to disentangle which evaluation biases are explained by narcissism and which by general experimental confounds, and this distorts measurements of self-preference bias. We identify a core methodological confound; correcting for it could reduce measurement error by 89.6%. Specifically, an LLM evaluator may deliver self-preferring verdicts when it judges queries that it answered incorrectly itself; this would be true regardless of whether one of the responses is its own. To decouple self-preference signals from noisy outputs on hard problems, we introduce an Evaluator Quality Baseline, which compares the probability that a judge incorrectly votes for itself against the probability that…
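
As a rough illustration of the first quantity named in the abstract, the sketch below estimates how often a judge votes for its own answer when that answer is wrong and the competing answer is right. This is not the paper's implementation: the `Verdict` record, its field names, and the reading of "incorrectly votes for itself" are assumptions made for this example. Because the abstract is cut off before it states what this probability is compared against, the comparison term is left as a caller-supplied `reference_rate`.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Verdict:
    """One pairwise judgment (hypothetical record layout, not from the paper)."""
    own_correct: bool      # was the judge's own answer correct?
    other_correct: bool    # was the competing answer correct?
    voted_for_self: bool   # did the judge pick its own answer?


def incorrect_self_vote_rate(verdicts: Iterable[Verdict]) -> Optional[float]:
    """Share of comparisons, among those where the judge's own answer is wrong
    and the other answer is right, in which the judge still votes for itself.
    Returns None when no comparison meets that condition."""
    eligible = [v for v in verdicts if not v.own_correct and v.other_correct]
    if not eligible:
        return None
    return sum(v.voted_for_self for v in eligible) / len(eligible)


def excess_over_reference(verdicts: Iterable[Verdict],
                          reference_rate: float) -> Optional[float]:
    """Difference between the rate above and a caller-supplied reference rate.
    The reference is a placeholder: the source text is truncated before it
    says which probability serves as the baseline."""
    rate = incorrect_self_vote_rate(verdicts)
    return None if rate is None else rate - reference_rate
```

A positive difference would suggest the judge errs toward itself more often than the chosen reference; how that reference should be defined is left to the full paper.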

Taxonomy

Topics: Topic Modeling · Ethics and Social Impacts of AI · Mobile Crowdsensing and Crowdsourcing