The Necessity of Setting Temperature in LLM-as-a-Judge

Lujun Li; Lama Sleem; Yangjie Xu; Yewei Song; Aolin Jia; Jerome Francois; Radu State

arXiv:2603.28304·cs.CL·March 31, 2026

TL;DR

This paper investigates how temperature settings affect the performance of LLMs used as judges in evaluating text quality, revealing that temperature choice significantly impacts evaluation outcomes.

Contribution

It provides a systematic analysis of temperature effects on LLM judge performance using controlled experiments and causal inference methods.

Findings

01

Temperature significantly influences LLM judge behavior.

02

Lower temperatures do not always lead to better evaluation accuracy.

03

Task-dependent effects of temperature are observed in LLM judging performance.

Abstract

LLM-as-a-Judge has emerged as an effective and low-cost paradigm for evaluating text quality and factual correctness. Prior studies have shown substantial agreement between LLM judges and human experts, even on tasks that are difficult to assess automatically. In practice, researchers commonly use fixed temperature configurations during evaluation, with values of 0.1 and 1.0 being the most prevalent choices, a convention that is largely empirical rather than principled. However, recent research suggests that LLM performance exhibits non-trivial sensitivity to temperature settings, that lower temperatures do not universally yield optimal outcomes, and that such effects are highly task-dependent. This raises a critical research question: does temperature influence judge performance in LLM-centric evaluation? To address this, we systematically investigate the relationship…
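For readers unfamiliar with the parameter, the following is a minimal sketch of the standard temperature-scaled softmax used in sampling, not the authors' experimental setup. The verdict labels and logit values are hypothetical and chosen only for illustration; the two temperatures, 0.1 and 1.0, are the ones the abstract names as most common.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for three judge verdicts (illustrative values only):
# "A is better", "B is better", "tie".
verdict_logits = [2.0, 1.0, 0.5]

for t in (0.1, 1.0):
    probs = softmax_with_temperature(verdict_logits, t)
    print(f"T={t}: {[round(p, 3) for p in probs]}")
```

At the lower temperature nearly all probability mass sits on the top-scoring verdict, so repeated judgments are close to deterministic. At the higher temperature the alternatives keep meaningful mass, so the same judge can return different verdicts across runs.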
