Modeling LLM Agent Reviewer Dynamics in Elo-Ranked Review System

Hsiang-Wei Huang; Junbin Lu; Kuang-Ming Chen; Jenq-Neng Hwang

arXiv:2601.08829·cs.CL·January 14, 2026

TL;DR

This paper simulates how Large Language Model (LLM) agent reviewers behave in an Elo-ranked conference review system. It finds that Elo ratings improve Area Chair decision accuracy, and that reviewers adapt their strategies to exploit the ratings without putting more effort into their reviews.

Contribution

It introduces a simulation framework for LLM reviewer dynamics in Elo-ranked review systems, demonstrating the impact of Elo on decision accuracy and reviewer behavior.

Findings

01 Elo improves Area Chair decision accuracy.

02 Reviewers adapt strategies to exploit Elo ratings.

03 Elo does not increase review effort.

Abstract

In this work, we explore Large Language Model (LLM) agent reviewer dynamics in an Elo-ranked review system using real-world conference paper submissions. Multiple LLM agent reviewers with different personas engage in multi-round review interactions moderated by an Area Chair. We compare a baseline setting with conditions that incorporate Elo ratings and reviewer memory. Our simulation results reveal several interesting findings, including that incorporating Elo improves Area Chair decision accuracy and that reviewers adopt an adaptive review strategy that exploits our Elo system without improving review effort. Our code is available at https://github.com/hsiangwei0903/EloReview.
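
The abstract does not state how reviewer ratings are updated. As a rough illustration only, the sketch below assumes the textbook two-player Elo rule, with a hypothetical K-factor of 32 and the conventional 400-point scale. The function names are invented for this example and are not taken from the EloReview repository; the paper's actual update may differ.

```python
# Minimal sketch of a standard Elo update (an assumption, not the paper's method).

def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A outperforms B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, score_a: float,
               k: float = 32.0) -> tuple[float, float]:
    """Return updated (rating_a, rating_b).

    score_a is 1.0 if A's review is judged better, 0.0 if worse, 0.5 for a tie.
    k is a hypothetical K-factor controlling how fast ratings move.
    """
    exp_a = expected_score(rating_a, rating_b)
    delta = k * (score_a - exp_a)
    # The update is zero-sum: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta


if __name__ == "__main__":
    # Two reviewers start level; reviewer A's review is judged better.
    print(update_elo(1500.0, 1500.0, score_a=1.0))
```

Under this rule a reviewer gains more from outperforming a higher-rated peer than a lower-rated one, which is the kind of incentive a strategic reviewer could learn to exploit.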

Taxonomy

Topics: Expert finding and Q&A systems · Topic Modeling · Sentiment Analysis and Opinion Mining