PeerSum: A Peer Review Dataset for Abstractive Multi-document Summarization
Miao Li, Jianzhong Qi, Jey Han Lau

TL;DR
PeerSum is a novel multi-document summarization dataset based on peer reviews of scientific papers, featuring highly abstractive summaries and source disagreements, challenging current models and opening new research avenues.
Contribution
Introduces PeerSum, a unique dataset with real, abstractive summaries and source disagreements, advancing multi-document summarization research.
Findings
State-of-the-art models underperform on PeerSum
Dataset highlights challenges in abstractive summarization
Opens new research directions for MDS models
Abstract
We present PeerSum, a new multi-document summarization (MDS) dataset built from peer reviews of scientific publications. Our dataset differs from existing MDS datasets in three ways: the summaries (i.e., the meta-reviews) are highly abstractive, they are real summaries of the source documents (i.e., the reviews), and the source documents contain disagreements. We found that current state-of-the-art MDS models struggle to generate high-quality summaries for PeerSum, which offers new research opportunities.
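To make "highly abstractive" concrete, the sketch below computes the share of summary n-grams that never appear in any source document, a common proxy for abstractiveness. It is illustrative only: the abstract does not say which metric the authors use, and the `ReviewSample` container, the whitespace tokenizer, and the toy review texts are all assumptions rather than the PeerSum schema or data.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass
class ReviewSample:
    """Hypothetical container: one paper's reviews (sources) and its meta-review (summary)."""
    reviews: List[str]
    meta_review: str


def ngrams(text: str, n: int) -> Set[Tuple[str, ...]]:
    """Return the set of lowercase, whitespace-tokenized n-grams in text."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def novel_ngram_ratio(sample: ReviewSample, n: int = 2) -> float:
    """Fraction of summary n-grams absent from every source document.

    Higher values suggest a more abstractive summary. Returns 0.0 when the
    summary is too short to contain any n-gram.
    """
    summary_ngrams = ngrams(sample.meta_review, n)
    if not summary_ngrams:
        return 0.0
    source_ngrams: Set[Tuple[str, ...]] = set()
    for review in sample.reviews:
        source_ngrams |= ngrams(review, n)
    novel = summary_ngrams - source_ngrams
    return len(novel) / len(summary_ngrams)


# Toy example (invented text, not taken from PeerSum): two reviews that
# disagree about the experiments, and a meta-review that summarizes them.
sample = ReviewSample(
    reviews=[
        "the method is novel but the experiments are weak",
        "the experiments are convincing and the paper is clear",
    ],
    meta_review="reviewers disagree on the experiments",
)
ratio = novel_ngram_ratio(sample, n=2)
print(f"novel bigram ratio: {ratio:.2f}")
```

In this toy case three of the four summary bigrams are new, so the ratio is 0.75. A purely extractive summary that copies a source sentence verbatim would score 0.0.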
Taxonomy
Topics: Topic Modeling · Advanced Text Analysis Techniques · Biomedical Text Mining and Ontologies
