The Judge Who Never Admits: Hidden Shortcuts in LLM-based Evaluation

Arash Marioriyad; Omid Ghahroodi; Ehsaneddin Asgari; Mohammad Hossein Rohban; Mahdieh Soleymani Baghshah

arXiv:2602.07996·cs.CL·February 10, 2026

TL;DR

This paper investigates whether large language models used as evaluators rely on hidden shortcuts rather than content quality. It finds that judges often fail to acknowledge the cues that influence their verdicts, which raises concerns about the reliability of model-based evaluation.

Contribution

The study introduces the cue acknowledgment rate (CAR) metric and demonstrates that LLM judges frequently rely on unreported shortcuts, exposing an explanation gap in model-based evaluation.

Findings

01. Models show high verdict shift rates due to cues.

02. CAR is near zero, indicating unreported cue reliance.

03. Cue recognition varies across datasets and models.
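
To make the two metrics behind these findings concrete, here is a minimal Python sketch. It rests on assumptions that this page does not confirm: VSR is taken to be the fraction of items whose verdict changes once a cue is injected, and CAR is taken to be the fraction of those shifted verdicts whose explanation mentions the cue. The keyword match is only a stand-in for whatever acknowledgment detection the paper uses, and `JudgedPair` and both function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class JudgedPair:
    """One evaluation item judged twice (hypothetical record layout)."""
    baseline_verdict: str   # verdict without any cue
    cued_verdict: str       # verdict with the cue injected
    cued_explanation: str   # judge's rationale from the cued run


def verdict_shift_rate(records: List[JudgedPair]) -> float:
    """Assumed VSR: share of items whose verdict differs after cue injection."""
    if not records:
        return 0.0
    shifted = sum(r.baseline_verdict != r.cued_verdict for r in records)
    return shifted / len(records)


def cue_acknowledgment_rate(records: List[JudgedPair],
                            cue_terms: Iterable[str]) -> float:
    """Assumed CAR: share of shifted verdicts whose explanation names the cue.

    Keyword matching is an illustrative simplification, not the paper's method.
    """
    shifted = [r for r in records if r.baseline_verdict != r.cued_verdict]
    if not shifted:
        return 0.0
    terms = [t.lower() for t in cue_terms]
    acknowledged = sum(
        any(t in r.cued_explanation.lower() for t in terms) for r in shifted
    )
    return acknowledged / len(shifted)


records = [
    JudgedPair("A", "B", "Response B is clearer and more complete."),
    JudgedPair("A", "A", "Response A answers the question directly."),
    JudgedPair("B", "A", "The expert source makes A more credible."),
    JudgedPair("A", "A", "Response A is more accurate."),
]
vsr = verdict_shift_rate(records)
car = cue_acknowledgment_rate(records, ["expert"])
```

Under these assumed definitions, a high VSR combined with a near-zero CAR is the pattern the findings describe: verdicts move with the cue, yet the explanations rarely say so.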

Abstract

Large language models (LLMs) are increasingly used as automatic judges to evaluate system outputs in tasks such as reasoning, question answering, and creative writing. A faithful judge should base its verdicts solely on content quality, remain invariant to irrelevant context, and transparently reflect the factors driving its decisions. We test this ideal via controlled cue perturbations (synthetic metadata labels injected into evaluation prompts) for six judge models: GPT-4o, Gemini-2.0-Flash, Gemma-3-27B, Qwen3-235B, Claude-3-Haiku, and Llama3-70B. Experiments span two complementary datasets with distinct evaluation regimes: ELI5 (factual QA) and LitBench (open-ended creative writing). We study six cue families: source, temporal, age, gender, ethnicity, and educational status. Beyond measuring verdict shift rates (VSR), we introduce cue acknowledgment rate (CAR) to quantify whether…
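
The cue perturbation described in the abstract can be pictured with a short Python sketch. The prompt wording, the bracketed label format, and the function `build_judge_prompt` are all assumptions made for illustration; the paper's actual templates are not shown on this page. The point is only that the cued prompt differs from the baseline by a single metadata line, so any verdict change can be attributed to the cue.

```python
from typing import Optional, Tuple


def build_judge_prompt(question: str,
                       response_a: str,
                       response_b: str,
                       cue: Optional[Tuple[str, str]] = None) -> str:
    """Build a pairwise judging prompt, optionally attaching a metadata cue.

    `cue` is a hypothetical (target, label) pair, where target is "A" or "B"
    and label is the synthetic metadata text to attach to that response.
    """
    def block(name: str, text: str) -> str:
        lines = [f"Response {name}:"]
        if cue is not None and cue[0] == name:
            # The injected label is the only difference from the baseline.
            lines.append(f"[{cue[1]}]")
        lines.append(text)
        return "\n".join(lines)

    return "\n\n".join([
        "Which response is better? Answer 'A' or 'B' and explain why.",
        f"Question: {question}",
        block("A", response_a),
        block("B", response_b),
    ])


baseline = build_judge_prompt("Why is the sky blue?", "Answer one.", "Answer two.")
cued = build_judge_prompt(
    "Why is the sky blue?", "Answer one.", "Answer two.",
    cue=("B", "Author: a human expert"),  # illustrative source cue
)
```

Running a judge on `baseline` and `cued` for the same item yields the pair of verdicts needed to measure a shift, and the explanation from the cued run is what would be inspected for acknowledgment.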

Taxonomy

Topics: Topic Modeling · Artificial Intelligence in Healthcare and Education · Computational and Text Analysis Methods