ScisummNet: A Large Annotated Corpus and Content-Impact Models for Scientific Paper Summarization with Citation Networks
Michihiro Yasunaga, Jungo Kasai, Rui Zhang, Alexander R. Fabbri, Irene, Li, Dan Friedman, Dragomir R. Radev

TL;DR
This paper introduces a large annotated corpus for scientific paper summarization and proposes hybrid summarization methods that incorporate both abstracts and citation impacts to produce more comprehensive summaries.
Contribution
The paper presents the first large-scale manually-annotated corpus for scientific papers and introduces hybrid summarization models combining abstracts and citation impacts.
Findings
Hybrid summaries outperform traditional citation-based summaries.
The annotated corpus improves training of summarization models.
Hybrid methods effectively capture research impact.
Abstract
Scientific article summarization is challenging: large, annotated corpora are not available, and the summary should ideally include the article's impacts on research community. This paper provides novel solutions to these two challenges. We 1) develop and release the first large-scale manually-annotated corpus for scientific papers (on computational linguistics) by enabling faster annotation, and 2) propose summarization methods that integrate the authors' original highlights (abstract) and the article's actual impacts on the community (citations), to create comprehensive, hybrid summaries. We conduct experiments to demonstrate the efficacy of our corpus in training data-driven models for scientific paper summarization and the advantage of our hybrid summaries over abstracts and traditional citation-based summaries. Our large annotated corpus and hybrid methods provide a new framework…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsTopic Modeling · Biomedical Text Mining and Ontologies · Advanced Text Analysis Techniques
