Am I More Pointwise or Pairwise? Revealing Position Bias in Rubric-Based LLM-as-a-Judge

Yuzheng Xu; Tosho Hirasawa; Tadashi Kozuno; Yoshitaka Ushiku

arXiv:2602.02219·cs.CL·February 3, 2026

Open Access

TL;DR

This paper uncovers position bias in rubric-based LLM evaluation, showing that score options' positions influence judgments, and proposes a permutation strategy to mitigate bias and improve reliability.

Contribution

The study reveals position bias in rubric-based LLM evaluation and introduces a permutation method to reduce bias and enhance correlation with human judgments.

Findings

01. Position bias is consistent across models and datasets.

02. Balanced permutation reduces position bias and improves evaluation accuracy.

03. Permutation-based calibration enhances alignment with human judgments.

Abstract

Large language models (LLMs) are now widely used to evaluate the quality of text, a field commonly referred to as LLM-as-a-judge. While prior work mainly focuses on point-wise and pair-wise evaluation paradigms, rubric-based evaluation, where LLMs select a score from multiple rubrics, has received less analysis. In this work, we show that rubric-based evaluation implicitly resembles a multi-choice setting and therefore has position bias: LLMs prefer score options appearing at specific positions in the rubric list. Through controlled experiments across multiple models and datasets, we demonstrate consistent position bias. To mitigate this bias, we propose a balanced permutation strategy that evenly distributes each score option across positions. We show that aggregating scores across balanced permutations not only reveals latent position bias, but also improves correlation between the…
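The balanced permutation idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the permutations are constructed, so the cyclic rotation (a Latin square) used here is only one assumed way to place every score option at every position exactly once. The `judge` callable is a hypothetical stand-in for an LLM call that receives the rubric order and returns the chosen score, and averaging is an assumed aggregation rule.

```python
from collections import Counter
from statistics import mean
from typing import Callable, Sequence

# Hypothetical judge type: takes the score options in the order they are
# listed in the prompt and returns the score the model selected.
Judge = Callable[[Sequence[int]], int]


def balanced_permutations(options: Sequence[int]) -> list[list[int]]:
    """Cyclic rotations of `options` (assumed construction).

    Across the n rotations, each option occupies each position exactly once.
    """
    n = len(options)
    return [[options[(start + i) % n] for i in range(n)] for start in range(n)]


def aggregate_score(judge: Judge, options: Sequence[int]) -> float:
    """Average the judge's score over all balanced orderings."""
    return mean(judge(order) for order in balanced_permutations(options))


def position_counts(judge: Judge, options: Sequence[int]) -> Counter:
    """Count how often the chosen score sat at each list position.

    A strongly skewed count for the same input suggests position bias.
    """
    counts: Counter = Counter()
    for order in balanced_permutations(options):
        counts[order.index(judge(order))] += 1
    return counts


def first_position_judge(order: Sequence[int]) -> int:
    """Toy, fully biased judge: always picks the first listed option."""
    return order[0]


rubric = [1, 2, 3, 4, 5]
perms = balanced_permutations(rubric)
avg = aggregate_score(first_position_judge, rubric)
bias = position_counts(first_position_judge, rubric)
```

With the toy judge, every selection lands on position 0, so `position_counts` exposes the bias directly, while the averaged score collapses to the midpoint of the scale because the judge carries no information about the text. A real judge would be expected to fall between this extreme and a position-invariant one.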

Taxonomy

Topics: Topic Modeling · Computational and Text Analysis Methods · Authorship Attribution and Profiling