Pre-review to Peer review: Pitfalls of Automating Reviews using Large Language Models
Akhil Pandey Akella, Harish Varma Siravuri, Shaurya Rohatgi

TL;DR
This study evaluates the potential and risks of using large language models to automate peer reviews, finding they can assist in pre-review screening but exhibit misalignment and overconfidence issues compared to human reviewers.
Contribution
The paper provides an experimental analysis of frontier open-weight LLMs for peer review, highlighting their utility and pitfalls, and introduces an open-source dataset for further research.
Findings
LLMs show weak correlation with human reviews (0.15)
Models tend to overestimate review quality by 3-5 points
LLM reviews correlate more with post-publication metrics than human scores
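To make the first two findings concrete, here is a minimal sketch of how a score correlation and an average rating gap could be computed from paired ratings. It assumes Pearson correlation and a simple mean difference; the summary above does not say which correlation coefficient or rating scale the authors used. The function names `pearson` and `mean_gap` and the toy ratings are hypothetical and are not taken from the paper or its dataset.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(vx * vy)


def mean_gap(llm_scores, human_scores):
    """Average of (LLM score - human score); positive means the LLM rates higher."""
    if len(llm_scores) != len(human_scores) or not llm_scores:
        raise ValueError("need two non-empty, equal-length sequences")
    return sum(l - h for l, h in zip(llm_scores, human_scores)) / len(llm_scores)


# Hypothetical toy ratings for five papers (NOT data from the paper).
human = [3.0, 5.0, 6.0, 4.0, 7.0]
llm = [7.0, 8.0, 8.0, 9.0, 8.0]

print("correlation:", round(pearson(llm, human), 3))
print("mean LLM - human gap:", mean_gap(llm, human))
```

In this toy example the LLM scores cluster near the top of the scale regardless of the human rating, which produces a low correlation together with a positive gap, the same qualitative pattern the findings describe.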
Abstract
Large Language Models are versatile general-task solvers, and their capabilities can truly assist people with scholarly peer review as pre-review agents, if not as fully autonomous peer-review agents. While incredibly beneficial, automating academic peer-review, as a concept, raises concerns surrounding safety, research integrity, and the validity of the academic peer-review process. The majority of the studies performing a systematic evaluation of frontier LLMs generating reviews across science disciplines miss the mark on addressing the alignment/misalignment of reviews along with the utility of LLM generated reviews when compared against publication outcomes such as citations, hit-papers, novelty, and disruption. This paper presents an experimental study in which we gathered ground-truth reviewer ratings from OpenReview and used…
Taxonomy
Topics: Expert finding and Q&A systems · Academic integrity and plagiarism · Academic Publishing and Open Access
