MMReview: A Multidisciplinary and Multimodal Benchmark for LLM-Based Peer Review Automation

Xian Gao; Jiacheng Ruan; Zongyun Zhang; Jingsheng Gao; Ting Liu; Yuzhuo Fu

arXiv:2508.14146·cs.CL·October 9, 2025

MMReview: A Multidisciplinary and Multimodal Benchmark for LLM-Based Peer Review Automation

Xian Gao, Jiacheng Ruan, Zongyun Zhang, Jingsheng Gao, Ting Liu, Yuzhuo Fu

PDF

Open Access

TL;DR

MMReview introduces a comprehensive, multimodal benchmark across multiple disciplines to evaluate LLMs' effectiveness in automating peer review, addressing current gaps in assessment standards.

Contribution

It presents the first unified benchmark with multimodal content and expert reviews for assessing LLMs in peer review tasks across diverse academic fields.

Findings

01

Open-source models show varying performance on the benchmark.

02

Advanced models outperform baseline models in review quality.

03

Benchmark reveals strengths and weaknesses of current LLMs in peer review tasks.

Abstract

With the rapid growth of academic publications, peer review has become an essential yet time-consuming responsibility within the research community. Large Language Models (LLMs) have increasingly been adopted to assist in the generation of review comments; however, current LLM-based review tasks lack a unified evaluation benchmark to rigorously assess the models' ability to produce comprehensive, accurate, and human-aligned assessments, particularly in scenarios involving multimodal content such as figures and tables. To address this gap, we propose \textbf{MMReview}, a comprehensive benchmark that spans multiple disciplines and modalities. MMReview includes multimodal content and expert-written review comments for 240 papers across 17 research domains within four major academic disciplines: Artificial Intelligence, Natural Sciences, Engineering Sciences, and Social Sciences. We design…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsSemantic Web and Ontologies