Exploring the Effects of Alignment on Numerical Bias in Large Language Models

Ayako Sato; Hwichan Kim; Zhousi Chen; Masato Mita; Mamoru Komachi

arXiv:2601.16444·cs.CL·January 27, 2026

TL;DR

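This paper investigates whether alignment (instruction tuning and preference tuning) is a cause of numerical bias in the scores produced by LLM evaluators, finds that alignment increases this bias, and identifies score range adjustment as the most effective of the mitigation strategies it tests.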

Contribution

It identifies the link between alignment and increased numerical bias in LLM evaluators and evaluates mitigation methods, highlighting score range adjustment as most effective.

Findings

01. Alignment increases numerical bias in LLM evaluators.

02. Score range adjustment reduces bias and improves evaluation performance.

03. Mitigation strategies need further refinement for robustness.

Abstract

"LLM-as-a-judge," which utilizes large language models (LLMs) as evaluators, has proven effective in many evaluation tasks. However, evaluator LLMs exhibit numerical bias, a phenomenon where certain evaluation scores are generated disproportionately often, leading to reduced evaluation performance. This study investigates the cause of this bias. Given that most evaluator LLMs are aligned through instruction tuning and preference tuning, and that prior research suggests alignment reduces output diversity, we hypothesize that numerical bias arises from alignment. To test this, we compare outputs from pre- and post-alignment LLMs, and observe that alignment indeed increases numerical bias. We also explore mitigation strategies for post-alignment LLMs, including temperature scaling, distribution calibration, and score range adjustment. Among these, score range adjustment is most effective in…
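To make the notion of numerical bias and one of the named mitigations concrete, here is a minimal Python sketch. It is not the paper's method: the abstract does not state which bias metric the authors use, so the share of the most frequent score and the normalized entropy of the score distribution are assumptions chosen for illustration. The function names, the 1 to 5 score range, and the sample scores are hypothetical. The temperature function shows the standard reweighting of a probability distribution, which is one common reading of "temperature scaling".

```python
import math
from collections import Counter


def score_distribution(scores, score_range):
    """Empirical probability of each allowed score (hypothetical helper)."""
    counts = Counter(scores)
    total = len(scores)
    return {s: counts.get(s, 0) / total for s in score_range}


def normalized_entropy(dist):
    """Entropy of the distribution divided by its maximum, log(K).

    1.0 means scores are spread uniformly; values near 0 mean a few
    scores dominate. Used here as an assumed proxy for numerical bias.
    """
    if len(dist) < 2:
        return 0.0
    h = -sum(p * math.log(p) for p in dist.values() if p > 0)
    return h / math.log(len(dist))


def apply_temperature(probs, temperature):
    """Reweight probabilities as p_i^(1/T) and renormalize.

    This equals softmax(logits / T); T > 1 flattens the distribution.
    """
    scaled = [p ** (1.0 / temperature) for p in probs]
    z = sum(scaled)
    return [s / z for s in scaled]


# Illustrative scores from a hypothetical evaluator on a 1-5 scale.
scores = [4, 4, 4, 3, 4, 5, 4, 4, 2, 4]
dist = score_distribution(scores, range(1, 6))
top_share = max(dist.values())   # fraction taken by the most frequent score
bias_proxy = normalized_entropy(dist)

# Flattening a skewed distribution over four candidate scores with T = 2.
flattened = apply_temperature([0.7, 0.1, 0.1, 0.1], temperature=2.0)
```

In this toy data the score 4 accounts for 7 of 10 outputs, so `top_share` is high and the normalized entropy falls well below 1. Applying a temperature above 1 lowers the probability of the dominant score, which illustrates why temperature scaling is a natural candidate mitigation.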

Taxonomy

Topics: Topic Modeling · Computational and Text Analysis Methods · Explainable Artificial Intelligence (XAI)