NLPeer: A Unified Resource for the Computational Study of Peer Review
Nils Dycke, Ilia Kuznetsov, Iryna Gurevych

TL;DR
NLPeer provides a comprehensive, multi-domain dataset of peer reviews and papers, enabling systematic NLP research to improve peer review processes with new tasks and structured data.
Contribution
Introduces NLPeer, the first ethically sourced, multi-domain peer review dataset with structured representations, supporting systematic NLP research and new reviewing assistance tasks.
Findings
Created a dataset of over 5,000 papers and 11,000 reviews from five venues.
Established a unified data representation and augmented existing datasets.
Implemented and analyzed three peer review assistance tasks, including the novel task of guided skimming.
Abstract
Peer review constitutes a core component of scholarly publishing; yet it demands substantial expertise and training, and is susceptible to errors and biases. Various applications of NLP for peer reviewing assistance aim to support reviewers in this complex process, but the lack of clearly licensed datasets and multi-domain corpora prevents the systematic study of NLP for peer review. To remedy this, we introduce NLPeer -- the first ethically sourced multidomain corpus of more than 5k papers and 11k review reports from five different venues. In addition to the new datasets of paper drafts, camera-ready versions and peer reviews from the NLP community, we establish a unified data representation and augment previous peer review datasets to include parsed and structured paper representations, rich metadata and versioning information. We complement our resource with implementations and…
Taxonomy
Topics: Topic Modeling · Wikis in Education and Collaboration · Natural Language Processing Techniques
