# Scientific document summarization via citation contextualization and scientific discourse

**Authors:** Arman Cohan, Nazli Goharian

arXiv: 1706.03449 · 2017-06-13

## TL;DR

This paper presents a framework for scientific document summarization that pairs each citation with its supporting context in the cited paper and labels it with a discourse facet. On two datasets, one biomedical and one from computational linguistics, the authors report improvements over the state of the art by large margins.

## Contribution

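The paper proposes three approaches to contextualizing citations (query reformulation, word embeddings, and supervised learning), a trained model that identifies the discourse facet of each citation, and a summarization method that combines the faceted citations with their contexts. The method is evaluated in the biomedical and computational linguistics domains.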

## Key findings

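- Summarization performance improves over state-of-the-art methods by large margins on both evaluation datasets
- Citations are contextualized with three approaches: query reformulation, word embeddings, and supervised learning
- A trained model identifies the discourse facet of each citation, and the summarizer uses these facets together with the citation contexts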

## Abstract

The rapid growth of scientific literature has made it difficult for researchers to quickly learn about the developments in their respective fields. Scientific document summarization addresses this challenge by providing summaries of the important contributions of scientific papers. We present a framework for scientific summarization which takes advantage of the citations and the scientific discourse structure. Citation texts often lack the evidence and context to support the content of the cited paper and are even sometimes inaccurate. We first address the problem of inaccuracy of the citation texts by finding the relevant context from the cited paper. We propose three approaches for contextualizing citations which are based on query reformulation, word embeddings, and supervised learning. We then train a model to identify the discourse facets for each citation. We finally propose a method for summarizing scientific papers by leveraging the faceted citations and their corresponding contexts. We evaluate our proposed method on two scientific summarization datasets in the biomedical and computational linguistics domains. Extensive evaluation results show that our methods can improve over the state of the art by large margins.
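To make the idea of citation contextualization concrete, the sketch below treats a citation text as a query and ranks sentences of the cited paper by term-frequency cosine similarity. This is only an illustrative lexical baseline, not one of the three approaches the paper proposes (query reformulation, word embeddings, supervised learning). The function names `tokenize`, `cosine`, and `contextualize`, the `top_k` parameter, and the sample sentences are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def contextualize(citation_text, cited_sentences, top_k=2):
    """Return up to top_k cited-paper sentences most similar to the citation.

    Assumption: plain lexical overlap stands in for the paper's methods.
    Sentences with zero overlap are never returned.
    """
    query = Counter(tokenize(citation_text))
    scored = [
        (cosine(query, Counter(tokenize(sentence))), index)
        for index, sentence in enumerate(cited_sentences)
    ]
    # Highest score first; ties keep the original sentence order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [cited_sentences[i] for score, i in scored[:top_k] if score > 0]


# Hypothetical example data, invented for illustration only.
citation = "Smith et al. showed that protein X regulates cell growth."
cited = [
    "We collected samples from forty patients.",
    "Our results indicate that protein X regulates growth in tumor cells.",
    "Funding was provided by a national agency.",
]
context = contextualize(citation, cited, top_k=2)
```

In this toy input only the second sentence shares vocabulary with the citation, so it is the only context returned. A pure lexical match like this misses paraphrases, which is one plausible motivation for the embedding-based and supervised approaches the abstract mentions.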

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1706.03449/full.md

## Figures

15 figures with captions in the complete paper: https://tomesphere.com/paper/1706.03449/full.md

## References

80 references — full list in the complete paper: https://tomesphere.com/paper/1706.03449/full.md

---
Source: https://tomesphere.com/paper/1706.03449