On the Composition of Scientific Abstracts
Iana Atanassova, Marc Bertin, Vincent Larivière

TL;DR
This study analyzes the composition of scientific abstracts by measuring sentence-level similarity between abstracts and the full text of their articles. It finds that most abstracts reuse sentences from the article, drawn mainly from the introduction and conclusion, and that this pattern holds across multiple journals.
Contribution
It provides a large-scale analysis of how abstracts are constructed by quantifying sentence reuse and identifying common source sections within research articles.
Findings
84% of abstracts share at least one sentence with the main article
Abstract sentences predominantly originate from the beginning of introductions and end of conclusions
The source of abstract sentences is consistent across different PLOS journals
Abstract
Scientific abstracts contain what is considered by the author(s) as information that best describes documents' content. They represent a compressed view of the informational content of a document and allow readers to evaluate the relevance of the document to a particular information need. However, little is known about their composition. This paper contributes to the understanding of the structure of abstracts by comparing similarity between scientific abstracts and the text content of research articles. More specifically, using sentence-based similarity metrics, we quantify the phenomenon of text re-use in abstracts and examine the positions of the sentences that are similar to sentences in abstracts in the IMRaD structure (Introduction, Methods, Results and Discussion), using a corpus of over 85,000 research articles published in the seven PLOS journals. We provide evidence that 84% of…
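The kind of sentence-level comparison the abstract describes can be sketched as follows. This is a minimal illustration, not the authors' method: the page does not say which similarity metric or threshold the paper uses, so Jaccard overlap on lowercase word sets and a threshold of 0.8 are assumptions, and the names `tokenize`, `jaccard`, and `best_matches` are hypothetical.

```python
import re


def tokenize(sentence):
    """Lowercase a sentence and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def jaccard(a, b):
    """Jaccard similarity between the token sets of two sentences."""
    ta, tb = tokenize(a), tokenize(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def best_matches(abstract_sentences, body_sentences, threshold=0.8):
    """For each abstract sentence, find the most similar body sentence.

    Returns one entry per abstract sentence: a tuple
    (body index, relative position in [0, 1], score) when the best score
    reaches the threshold (an assumed value), otherwise None.
    """
    n = len(body_sentences)
    results = []
    for sent in abstract_sentences:
        best = None
        for i, body in enumerate(body_sentences):
            score = jaccard(sent, body)
            if score >= threshold and (best is None or score > best[2]):
                # Relative position: 0.0 is the first sentence, 1.0 the last.
                best = (i, i / max(n - 1, 1), score)
        results.append(best)
    return results


if __name__ == "__main__":
    abstract = ["We study text reuse in abstracts."]
    body = [
        "We study text reuse in abstracts.",
        "The corpus has many articles.",
    ]
    print(best_matches(abstract, body))
```

Aggregating the relative positions of the matches over a corpus, grouped by section, would give a distribution of where abstract sentences come from, which is the kind of result the findings above report.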
Taxonomy
Topics: Advanced Text Analysis Techniques · Topic Modeling · Semantic Web and Ontologies
