Impact of large language models on peer review opinions from a fine-grained perspective: Evidence from top conference proceedings in AI
Wenqing Wu, Chengzhi Zhang, Yi Zhao, Tong Bao

TL;DR
This study analyzes how large language models influence peer review reports. Reviews have become longer and more fluent but pay less attention to deep evaluative aspects, with consequences for review quality and decision-making.
Contribution
It provides a systematic, fine-grained analysis of linguistic and evaluative changes in peer reviews due to LLMs, using automated annotation and statistical methods.
Findings
Peer reviews have become longer and more fluent since the emergence of LLMs.
Focus on surface-level clarity has increased, while critical evaluative aspects have declined.
Standardized linguistic patterns are more common in reviews with lower confidence scores.
Abstract
With the rapid advancement of Large Language Models (LLMs), the academic community has faced unprecedented disruptions, particularly in the realm of academic communication. The primary function of peer review is to improve the quality of academic manuscripts along evaluation aspects such as clarity and originality. Although prior studies suggest that LLMs are beginning to influence peer review, it remains unclear whether they are altering its core evaluative functions. Moreover, the extent to which LLMs affect the linguistic form, evaluative focus, and recommendation-related signals of peer-review reports has yet to be systematically examined. In this study, we examine the changes in peer review reports for academic articles following the emergence of LLMs, emphasizing variations at a fine-grained level. Specifically, we investigate linguistic features such as the length and complexity…
