Analysis of GraphSum's Attention Weights to Improve the Explainability of Multi-Document Summarization

M. Lautaro Hickmann; Fabian Wurzberger; Megi Hoxhalli; Arne Lochner; Jessica Töllich; Ansgar Scherp

arXiv:2105.11908·cs.CL·December 8, 2022

TL;DR

This paper analyzes attention weights in graph-based transformer models for multi-document summarization to enhance their explainability, revealing correlations between attention, source origin, and positional bias.

Contribution

It introduces a method to interpret attention weights in GraphSum, improving understanding of source contributions and positional influences in multi-document summarization.

Findings

01

Paragraph-level representations outperform sentence-level representations in summarization performance.

02

High correlation between attention weights and source similarity metrics.

03

Attention patterns reveal positional biases in summaries.

Abstract

Modern multi-document summarization (MDS) methods are based on transformer architectures. They generate state-of-the-art summaries, but lack explainability. We focus on graph-based transformer models for MDS as they gained recent popularity. We aim to improve the explainability of graph-based MDS by analyzing their attention weights. In a graph-based MDS model such as GraphSum, vertices represent the textual units, while the edges form some similarity graph over the units. We compare GraphSum's performance utilizing different textual units, i.e., sentences versus paragraphs, on two news benchmark datasets, namely WikiSum and MultiNews. Our experiments show that paragraph-level representations provide the best summarization performance. Thus, we subsequently focus on analyzing the paragraph-level attention weights of GraphSum's multi-heads and decoding layers in order to improve…
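
To make the graph structure mentioned in the abstract concrete, here is a minimal sketch of a similarity graph over textual units. It is an illustration under stated assumptions, not the GraphSum implementation: the whitespace tokenizer, the bag-of-words cosine similarity, and the names `bag_of_words`, `cosine`, and `similarity_graph` are all hypothetical choices made for this example. Each paragraph is a vertex, and the matrix entry for a pair of paragraphs is the weight of the edge between them.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Count lowercased whitespace tokens (a deliberately crude tokenizer)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_graph(units):
    """Return a symmetric matrix; entry [i][j] is the edge weight between units i and j."""
    vectors = [bag_of_words(u) for u in units]
    n = len(units)
    return [[cosine(vectors[i], vectors[j]) for j in range(n)] for i in range(n)]


# Toy paragraphs (invented for illustration, not taken from WikiSum or MultiNews).
paragraphs = [
    "the storm hit the coast on monday",
    "the storm caused flooding on the coast",
    "officials announced a new budget",
]
graph = similarity_graph(paragraphs)
```

In this toy input the first two paragraphs share vocabulary and so receive a higher edge weight than either has with the third. Swapping in sentences instead of paragraphs changes only the list passed to `similarity_graph`, which mirrors the sentence-versus-paragraph comparison the abstract describes.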
