Do LLMs Favor LLMs? Quantifying Interaction Effects in Peer Review
Vibhhu Sharma, Thorsten Joachims, Sarah Dean

TL;DR
This study investigates how large language models (LLMs) influence peer review. It finds that LLM-assisted reviews tend to be more lenient toward lower-quality papers, and that LLM-assisted metareviews are more likely to recommend acceptance.
Contribution
The paper provides the first comprehensive analysis of LLM use across the peer review pipeline, with particular attention to interaction effects, and highlights biases and their implications for policy and decision-making.
Findings
LLM-assisted reviews are more lenient towards lower-quality papers.
Fully LLM-generated reviews show severe rating compression.
LLM-assisted metareviews are more likely to recommend acceptance.
Abstract
There are increasing indications that LLMs are not only used for producing scientific papers, but also as part of the peer review process. In this work, we provide the first comprehensive analysis of LLM use across the peer review pipeline, with particular attention to interaction effects: not just whether LLM-assisted papers or LLM-assisted reviews are different in isolation, but whether LLM-assisted reviews evaluate LLM-assisted papers differently. In particular, we analyze over 125,000 paper-review pairs from ICLR, NeurIPS, and ICML. We initially observe what appears to be a systematic interaction effect: LLM-assisted reviews seem especially kind to LLM-assisted papers compared to papers with minimal LLM use. However, controlling for paper quality reveals a different story: LLM-assisted reviews are simply more lenient toward lower quality papers in general, and the…
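The confounding argument in the abstract can be illustrated with a small synthetic sketch. This is not the authors' data or method: the counts, the score scale, the `make_rows` and `interaction` helpers, and the assumption that LLM-assisted papers are more often lower quality are all hypothetical, chosen only to show how an apparent interaction can vanish once quality is held fixed.

```python
from statistics import mean

def make_rows():
    """Build synthetic (paper_llm, review_llm, quality, score) records.

    Assumptions (hypothetical, for illustration only):
    - LLM-assisted papers are mostly low quality (80/20); others mostly high (20/80).
    - Every paper gets one human review and one LLM-assisted review.
    - LLM-assisted reviews add +1 to low-quality papers, regardless of how
      the paper itself was written.
    """
    counts = {(True, "low"): 80, (True, "high"): 20,
              (False, "low"): 20, (False, "high"): 80}
    base = {"low": 4.0, "high": 7.0}
    rows = []
    for (paper_llm, quality), n in counts.items():
        for _ in range(n):
            for review_llm in (False, True):
                bonus = 1.0 if (review_llm and quality == "low") else 0.0
                rows.append((paper_llm, review_llm, quality, base[quality] + bonus))
    return rows

def interaction(rows):
    """Difference-in-differences of mean scores across the 2x2 cells."""
    def cell(paper_llm, review_llm):
        return mean(s for p, r, _, s in rows if p == paper_llm and r == review_llm)
    gap_llm_papers = cell(True, True) - cell(True, False)
    gap_other_papers = cell(False, True) - cell(False, False)
    return gap_llm_papers - gap_other_papers

rows = make_rows()
# Pooled over quality, LLM-assisted reviews look especially kind to LLM-assisted papers.
naive = interaction(rows)
# Within each quality stratum the same contrast is computed separately.
by_quality = {q: interaction([r for r in rows if r[2] == q]) for q in ("low", "high")}
print(naive, by_quality)
```

In this toy setup the pooled contrast is positive, while the per-stratum contrasts are zero, because the only built-in effect is leniency toward low-quality papers.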
Taxonomy
Topics: Academic Publishing and Open Access · Expert Finding and Q&A Systems · Scientometrics and Bibliometrics Research
