ReviewEval: An Evaluation Framework for AI-Generated Reviews

Madhav Krishan Garg; Tejash Prasad; Tanmay Singhal; Chhavi Kirtani; Murari Mandal; Dhruv Kumar

arXiv:2502.11736·cs.CL·May 27, 2025

TL;DR

This paper introduces ReviewEval, a comprehensive framework for evaluating AI-generated reviews, and ReviewAgent, an LLM-based review generator that improves review quality and alignment with human standards.

Contribution

It presents a novel evaluation framework and a review generation agent with alignment and self-refinement mechanisms, advancing AI's role in peer review processes.

Findings

01. ReviewAgent improves actionable insights by 6.78% over existing AI baselines and by 47.62% over expert reviews.

02. It enhances analytical depth by 3.97% and 12.73%.

03. It increases adherence to guidelines by 10.11% and 47.26%.

Abstract

The escalating volume of academic research, coupled with a shortage of qualified reviewers, necessitates innovative approaches to peer review. In this work, we propose: (1) ReviewEval, a comprehensive evaluation framework for AI-generated reviews that measures alignment with human assessments, verifies factual accuracy, assesses analytical depth, and gauges the degree of constructiveness and adherence to reviewer guidelines; and (2) ReviewAgent, an LLM-based review generation agent featuring a novel alignment mechanism that tailors feedback to target conferences and journals, a self-refinement loop that iteratively optimizes its intermediate outputs, and an external improvement loop that uses ReviewEval to improve the final reviews. ReviewAgent improves actionable insights by 6.78% and 47.62% over existing AI baselines and expert reviews, respectively. Further, it boosts analytical…
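The external improvement loop described in the abstract (generate a review, score it with an evaluator, revise, repeat) can be pictured with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the function names `improve_review`, `toy_evaluate`, and `toy_revise`, the 0-to-1 score scale, the mean-score stopping rule, and the `target` and `max_rounds` parameters. Only the five dimension names follow the abstract's wording. In a real system the evaluator and reviser would be LLM calls; here they are toy stand-ins so the loop can run on its own.

```python
from typing import Callable, Dict, Tuple

# Dimension names follow the abstract; the 0..1 scale is an assumption.
DIMENSIONS = (
    "human_alignment",
    "factual_accuracy",
    "analytical_depth",
    "constructiveness",
    "guideline_adherence",
)

Scores = Dict[str, float]


def mean_score(scores: Scores) -> float:
    """Average the per-dimension scores (hypothetical aggregation rule)."""
    return sum(scores[d] for d in DIMENSIONS) / len(DIMENSIONS)


def improve_review(
    draft: str,
    evaluate: Callable[[str], Scores],
    revise: Callable[[str, str], str],
    target: float = 0.8,
    max_rounds: int = 3,
) -> Tuple[str, Scores, int]:
    """Revise a review until its mean score reaches `target` or rounds run out."""
    review, scores, rounds = draft, evaluate(draft), 0
    while mean_score(scores) < target and rounds < max_rounds:
        # Focus the next revision on the weakest dimension.
        weakest = min(DIMENSIONS, key=lambda d: scores[d])
        candidate = revise(review, weakest)
        candidate_scores = evaluate(candidate)
        rounds += 1
        # Keep the revision only if the evaluator rates it higher overall.
        if mean_score(candidate_scores) > mean_score(scores):
            review, scores = candidate, candidate_scores
    return review, scores, rounds


def toy_evaluate(review: str) -> Scores:
    """Toy evaluator: a dimension scores 0.9 if its tag appears, else 0.5."""
    return {d: 0.9 if f"[{d}]" in review else 0.5 for d in DIMENSIONS}


def toy_revise(review: str, dimension: str) -> str:
    """Toy reviser: append a tag marking the dimension as addressed."""
    return f"{review} [{dimension}]"


final_review, final_scores, rounds_used = improve_review(
    "Draft review.", toy_evaluate, toy_revise
)
print(rounds_used, final_review)
```

With the toy stand-ins, each round raises one dimension, so the default budget of three rounds is exhausted before the 0.8 target is met; raising `max_rounds` lets the loop stop on the target instead. Accepting a revision only when the evaluator scores it higher is one simple design choice among several and is not claimed to be the authors' method.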

Taxonomy

Topics: Explainable Artificial Intelligence (XAI)