Text Summarization With Graph Attention Networks
Mohammadreza Ardestani, Yllias Chali

TL;DR
This paper explores using graph attention networks with RST and co-reference graphs to improve text summarization, finds that a simpler MLP architecture performs better, and introduces an RST-annotated version of XSum for future research.
Contribution
It demonstrates that graph attention networks may not outperform simpler models in summarization and provides a new annotated dataset with RST graphs for benchmarking.
Findings
Graph attention networks did not improve summarization performance.
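A simple MLP architecture improved results on the primary dataset, CNN/DM, where the graph attention network had not.
The authors annotated XSum with RST graphs to establish a benchmark for future graph-based summarization models.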
Abstract
This study aimed to leverage graph information, particularly Rhetorical Structure Theory (RST) and Co-reference (Coref) graphs, to enhance the performance of our baseline summarization models. Specifically, we experimented with a Graph Attention Network architecture to incorporate graph information. However, this architecture did not improve performance. Subsequently, we used a simple Multi-layer Perceptron architecture, which improved the results of our proposed model on our primary dataset, CNN/DM. Additionally, we annotated the XSum dataset with RST graph information, establishing a benchmark for future graph-based summarization models. This secondary dataset posed multiple challenges, revealing both the merits and limitations of our models.
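The abstract does not spell out the architecture, so the following is only a rough illustration of what a generic single-head graph attention layer computes, not the authors' implementation. It assumes each node is a text unit with a feature vector and that the edges come from some graph such as an RST or co-reference graph. The function name `gat_layer`, the toy chain graph, and all dimensions are hypothetical choices made for this sketch.

```python
import numpy as np


def gat_layer(h, adj, W, a, negative_slope=0.2):
    """Single-head graph attention layer (illustrative sketch).

    h:   (n, d_in) node features, e.g. one vector per text unit
    adj: (n, n) 0/1 adjacency matrix, e.g. edges from an RST or Coref graph
    W:   (d_in, d_out) shared linear projection
    a:   (2 * d_out,) attention vector
    """
    n = h.shape[0]
    z = h @ W                      # project every node
    d_out = z.shape[1]

    # e[i, j] = LeakyReLU(a^T [z_i || z_j]); the dot product with the
    # concatenation splits into a source part and a target part.
    src = z @ a[:d_out]            # (n,)
    dst = z @ a[d_out:]            # (n,)
    e = src[:, None] + dst[None, :]
    e = np.where(e > 0, e, negative_slope * e)

    # Attend only over graph neighbours plus a self-loop.
    mask = (adj + np.eye(n)) > 0
    e = np.where(mask, e, -np.inf)

    # Row-wise softmax; the self-loop guarantees a finite maximum per row.
    e = e - e.max(axis=1, keepdims=True)
    alpha = np.exp(e)
    alpha = alpha / alpha.sum(axis=1, keepdims=True)

    # Each node's new representation is a weighted sum of its neighbours.
    return alpha @ z, alpha


# Toy usage: four nodes connected in a chain 0-1-2-3 (hypothetical graph).
rng = np.random.default_rng(0)
h = rng.normal(size=(4, 3))
adj = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])
W = rng.normal(size=(3, 2))
a = rng.normal(size=(4,))
out, alpha = gat_layer(h, adj, W, a)
```

Here `alpha[i, j]` is the weight node `i` places on node `j`, and it is zero wherever the graph has no edge. A simpler alternative, in the spirit of the MLP the abstract mentions, would pass node features through feed-forward layers without this neighbour-wise attention; how the authors actually feed graph information into their MLP is not stated in the abstract.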
