Can AI Be a Good Peer Reviewer? A Survey of Peer Review Process, Evaluation, and the Future
Sihong Wu, Owen Jiang, Yilun Zhao, Tiansheng Hu, Yiling Ma, Kaiyan Zhang, Manasi Patwardhan, Arman Cohan

TL;DR
This survey examines how large language models can assist or automate stages of the peer review process, including review generation, rebuttals, and meta-reviews, and how such systems are evaluated.
Contribution
It synthesizes current techniques, datasets, and evaluation strategies for integrating LLMs into the peer review workflow, highlighting limitations and future directions.
Findings
Catalogs datasets and compares modeling choices.
Discusses limitations and ethical concerns.
Provides guidance for building LLM-based review systems.
Abstract
Peer review is a multi-stage process involving reviews, rebuttals, meta-reviews, final decisions, and subsequent manuscript revisions. Recent advances in large language models (LLMs) have motivated methods that assist or automate different stages of this pipeline. In this survey, we synthesize techniques for (i) peer review generation, including fine-tuning strategies, agent-based systems, RL-based methods, and emerging paradigms to enhance generation; (ii) after-review tasks, including rebuttals, meta-reviews, and revisions aligned with reviews; and (iii) evaluation methods spanning human-centered, reference-based, LLM-based, and aspect-oriented approaches. We catalog datasets, compare modeling choices, and discuss limitations, ethical concerns, and future directions. The survey aims to provide practical guidance for building, evaluating, and integrating LLM systems across the full peer review workflow.
