Beyond "Not Novel Enough": Enriching Scholarly Critique with LLM-Assisted Feedback
Osama Mohammed Afzal, Preslav Nakov, Tom Hope, and Iryna Gurevych

TL;DR
This paper introduces a structured, LLM-assisted method for automated novelty assessment in peer review, improving consistency and transparency in evaluating scholarly submissions, especially in high-volume fields like NLP.
Contribution
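The paper presents a three-stage approach to automated novelty evaluation that models expert reviewer behavior: content extraction from the submission, retrieval and synthesis of related work, and structured comparison. It outperforms existing LLM-based baselines and aims to make novelty reviews more consistent.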
Findings
Achieves 86.5% alignment with human reasoning on 182 ICLR 2025 submissions
Reaches 75.3% agreement with human reviewers on novelty conclusions
Substantially outperforms existing LLM-based baselines
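As a rough illustration of what an agreement figure like the one above measures, the sketch below computes simple percent agreement between two sets of novelty labels. The function name, the label strings, and the toy data are all assumptions made for this example; the paper's exact metric definition is not given in this summary and may differ.

```python
def percent_agreement(system_labels, human_labels):
    """Share of items on which two label lists agree, as a percentage.

    Hypothetical helper for illustration; not the paper's evaluation code.
    """
    if len(system_labels) != len(human_labels):
        raise ValueError("label lists must have the same length")
    if not system_labels:
        raise ValueError("label lists must not be empty")
    matches = sum(s == h for s, h in zip(system_labels, human_labels))
    return 100.0 * matches / len(system_labels)


# Toy labels invented for this example; not data from the paper.
system = ["novel", "not novel", "novel", "novel"]
human = ["novel", "not novel", "not novel", "novel"]
print(percent_agreement(system, human))  # 75.0
```

Raw percent agreement does not correct for chance agreement, so it is best read alongside the label distribution.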
Abstract
Novelty assessment is a central yet understudied aspect of peer review, particularly in high-volume fields like NLP where reviewer capacity is increasingly strained. We present a structured approach for automated novelty evaluation that models expert reviewer behavior through three stages: content extraction from submissions, retrieval and synthesis of related work, and structured comparison for evidence-based assessment. Our method is informed by a large-scale analysis of human-written novelty reviews and captures key patterns such as independent claim verification and contextual reasoning. Evaluated on 182 ICLR 2025 submissions with human-annotated reviewer novelty assessments, the approach achieves 86.5% alignment with human reasoning and 75.3% agreement on novelty conclusions, substantially outperforming existing LLM-based baselines. The method produces detailed, literature-aware…
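The three stages named in the abstract (content extraction, retrieval of related work, structured comparison) can be pictured as a small pipeline. The sketch below is only a structural illustration: every function and class name is hypothetical, and each LLM-driven stage is replaced by a trivial stand-in (a keyword heuristic for extraction, Jaccard word overlap for retrieval and comparison). It is not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass
class Assessment:
    claim: str
    closest_work: str
    overlap: float
    verdict: str


def extract_claims(submission_text):
    """Stage 1 stand-in: treat each sentence containing 'we' as a claim."""
    sentences = [s.strip() for s in submission_text.split(".") if s.strip()]
    return [s for s in sentences if " we " in f" {s.lower()} "]


def token_overlap(a, b):
    """Jaccard overlap between the lowercase word sets of two strings."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


def retrieve_related(claim, corpus):
    """Stage 2 stand-in: pick the entry of a non-empty corpus with the highest overlap."""
    return max(corpus, key=lambda doc: token_overlap(claim, doc))


def compare(claim, related, threshold=0.5):
    """Stage 3 stand-in: flag the claim when overlap reaches an arbitrary threshold."""
    score = token_overlap(claim, related)
    verdict = "overlaps prior work" if score >= threshold else "appears novel"
    return Assessment(claim, related, score, verdict)


def assess_novelty(submission_text, corpus):
    """Run the three stand-in stages for every extracted claim."""
    return [
        compare(claim, retrieve_related(claim, corpus))
        for claim in extract_claims(submission_text)
    ]


# Toy inputs invented for this example.
corpus = ["we propose a graph parser", "image classification with convnets"]
for result in assess_novelty("We propose a graph parser. Results are strong.", corpus):
    print(result.verdict, "|", result.closest_work)
```

Keeping the stages as separate functions means each stand-in could be swapped for a model-backed component without changing how the stages are chained.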
Taxonomy
Topics: Expert finding and Q&A systems · Topic Modeling · Academic integrity and plagiarism
