PeerSum: A Peer Review Dataset for Abstractive Multi-document Summarization

Miao Li; Jianzhong Qi; Jey Han Lau

arXiv:2203.01769 · cs.IR · September 30, 2022 · 1 citation

Open Access · 1 Repo

TL;DR

PeerSum is a novel multi-document summarization dataset based on peer reviews of scientific papers, featuring highly abstractive summaries and source disagreements, challenging current models and opening new research avenues.

Contribution

Introduces PeerSum, a unique dataset with real, abstractive summaries and source disagreements, advancing multi-document summarization research.

Findings

01 State-of-the-art models underperform on PeerSum

02 Dataset highlights challenges in abstractive summarization

03 Opens new research directions for MDS models

Abstract

We present PeerSum, a new multi-document summarization (MDS) dataset built from peer reviews of scientific publications. It differs from existing MDS datasets in three ways: its summaries (the meta-reviews) are highly abstractive, they are real summaries of the source documents (the reviews), and the source documents contain disagreements. We found that current state-of-the-art MDS models struggle to generate high-quality summaries for PeerSum, which opens new research opportunities.
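One common way to make "highly abstractive" concrete is the share of summary n-grams that never appear in any source document. The sketch below is an illustration under that assumption, not the authors' measurement: the function names `ngrams` and `novel_ngram_ratio`, the whitespace tokenization, and the default `n = 2` are all hypothetical choices made for this example.

```python
from typing import List, Set, Tuple


def ngrams(tokens: List[str], n: int) -> Set[Tuple[str, ...]]:
    """Return the set of distinct n-grams in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def novel_ngram_ratio(sources: List[str], summary: str, n: int = 2) -> float:
    """Fraction of distinct summary n-grams absent from every source document.

    Assumption: lowercased whitespace tokenization stands in for a real
    tokenizer. A higher value suggests a more abstractive summary.
    """
    source_ngrams: Set[Tuple[str, ...]] = set()
    for doc in sources:
        source_ngrams |= ngrams(doc.lower().split(), n)

    summary_ngrams = ngrams(summary.lower().split(), n)
    if not summary_ngrams:
        # Summary shorter than n tokens: no n-grams to compare.
        return 0.0

    novel = summary_ngrams - source_ngrams
    return len(novel) / len(summary_ngrams)


if __name__ == "__main__":
    # Toy stand-ins for two reviews and a meta-review (not PeerSum data).
    reviews = ["the method is novel", "the method is weak"]
    meta_review = "reviewers disagree on the method"
    print(novel_ngram_ratio(reviews, meta_review))
```

In the toy example, only the bigram "the method" is shared with the reviews, so most of the meta-review's bigrams count as novel. Applied to real data, the same ratio would be averaged over all review and meta-review groups.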

Code & Models

Repositories

oaimli/peersum (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Advanced Text Analysis Techniques · Biomedical Text Mining and Ontologies