Re3: A Holistic Framework and Dataset for Modeling Collaborative Document Revision

Qian Ruan; Ilia Kuznetsov; Iryna Gurevych

arXiv:2406.00197 · cs.CL · January 30, 2026

TL;DR

Re3 is a framework for the joint analysis of collaborative document revision. The paper instantiates it for scientific papers as Re3-Sci, a corpus of aligned paper revisions with peer reviews and edit summaries, and uses the corpus to study how papers are revised and how well LLMs can automate edit analysis.

Contribution

The paper presents Re3, a novel holistic framework and a large annotated dataset for modeling collaborative document revision in the scholarly domain.

Findings

01 First empirical insights into academic collaborative revision

02 Assessment of state-of-the-art LLMs for edit analysis

03 Public availability of data and tools

Abstract

Collaborative review and revision of textual documents is the core of knowledge work and a promising target for empirical analysis and NLP assistance. Yet, a holistic framework that would allow modeling complex relationships between document revisions, reviews and author responses is lacking. To address this gap, we introduce Re3, a framework for joint analysis of collaborative document revision. We instantiate this framework in the scholarly domain, and present Re3-Sci, a large corpus of aligned scientific paper revisions manually labeled according to their action and intent, and supplemented with the respective peer reviews and human-written edit summaries. We use the new data to provide first empirical insights into collaborative document revision in the academic domain, and to assess the capabilities of state-of-the-art LLMs at automating edit analysis and facilitating text-based…

Taxonomy

Topics: Semantic Web and Ontologies · Natural Language Processing Techniques