ARIES: A Corpus of Scientific Paper Edits Made in Response to Peer Reviews
Mike D'Arcy, Alexis Ross, Erin Bransom, Bailey Kuehl, Jonathan Bragg, Tom Hope, Doug Downey

TL;DR
This paper introduces ARIES, a dataset of scientific paper revisions made in response to peer reviews, and evaluates models' ability to link comments to edits and generate appropriate revisions, revealing current limitations.
Contribution
The paper presents ARIES, a new dataset for studying automated paper revision based on peer feedback, and analyzes model performance and GPT-4's capabilities in this task.
Findings
Models struggle to identify which edits correspond to a comment, especially when the relationship is indirect and requires reasoning to uncover.
GPT-4 often produces superficial edits that lack technical depth.
The expert-labeled test set shows high inter-annotator agreement.
Abstract
We introduce the task of automatically revising scientific papers based on peer feedback and release ARIES, a dataset of review comments and their corresponding paper edits. The data is drawn from real reviewer-author interactions from computer science, and we provide labels linking each reviewer comment to the specific paper edits made by the author in response. We automatically create a high-precision silver training set, as well as an expert-labeled test set that shows high inter-annotator agreement. In experiments with 10 models covering the state of the art, we find that they struggle even to identify which edits correspond to a comment -- especially when the relationship between the edit and the comment is indirect and requires reasoning to uncover. We also extensively analyze GPT-4's ability to generate edits given a comment and the original paper. We find that it often succeeds…
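To make the comment-edit linking task concrete, here is a minimal sketch of a lexical-overlap baseline. It is not one of the models evaluated in the paper: the function names, the Jaccard scoring, and the 0.1 threshold are all assumptions chosen for illustration, and the example strings are invented rather than taken from ARIES.

```python
import re


def tokens(text):
    """Return the set of lowercase alphanumeric word tokens in text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity between the token sets of two strings."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def link_comment_to_edits(comment, edits, threshold=0.1):
    """Return indices of edits whose overlap with the comment meets the
    threshold, best match first. The threshold is an illustrative
    assumption, not a value from the paper."""
    scored = [(jaccard(comment, edit), i) for i, edit in enumerate(edits)]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [i for score, i in ranked if score >= threshold]


# Hypothetical reviewer comment and candidate edits (not from the dataset).
comment = "Please report the standard deviation across random seeds."
edits = [
    "We now report the standard deviation across five random seeds.",
    "Fixed a typo in the related work section.",
]
linked = link_comment_to_edits(comment, edits)
```

A baseline like this only finds edits that reuse the reviewer's wording. A comment such as "The evaluation is not convincing" shares no tokens with an edit like "We added experiments on two additional datasets", so it would go unlinked, which is the kind of indirect relationship the abstract describes as hard.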
Taxonomy
Topics: Topic Modeling · Artificial Intelligence in Healthcare and Education · Expert finding and Q&A systems
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Position-Wise Feed-Forward Layer · Absolute Position Encodings · Layer Normalization · Label Smoothing · Adam · Byte Pair Encoding · Residual Connection
