Rethinking Transformer-based Multi-document Summarization: An Empirical Investigation

Congbo Ma; Wei Emma Zhang; Dileepa Pitawela; Haojie Zhuang; Yanfeng Shu

arXiv:2407.11948·cs.CL·July 17, 2024

Open Access

TL;DR

This paper empirically investigates Transformer-based multi-document summarization, analyzing model behaviors, training strategies, and issues like repetition, to guide future improvements in summary quality.

Contribution

It provides a comprehensive empirical analysis of Transformer models in MDS, highlighting the impact of document boundaries, model structures, and training strategies.

Findings

01. Document boundary separators significantly affect performance.

02. Decoder sensitivity to noise impacts summary quality.

03. Repetition correlates with high uncertainty scores.
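
To make the third finding concrete, the following is a minimal sketch of two quantities one might compare when studying repetition: the share of repeated n-grams in a generated summary, and the Shannon entropy of a next-token distribution as a stand-in for an uncertainty score. Both function names and the choice of entropy are assumptions made for illustration; this page does not state how the paper defines its uncertainty score.

```python
import math
from collections import Counter


def repeated_ngram_fraction(tokens, n=3):
    """Fraction of n-grams that duplicate an earlier n-gram in the sequence.

    Hypothetical helper for illustration; not the paper's repetition metric.
    """
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0.0
    counts = Counter(ngrams)
    # Every occurrence beyond the first counts as a repetition.
    repeated = sum(count - 1 for count in counts.values())
    return repeated / len(ngrams)


def token_entropy(probs):
    """Shannon entropy (in nats) of a single next-token distribution.

    Assumed here as one common proxy for model uncertainty.
    """
    return -sum(p * math.log(p) for p in probs if p > 0.0)


if __name__ == "__main__":
    summary = "the cat sat the cat sat the cat sat".split()
    print(repeated_ngram_fraction(summary, n=3))
    print(token_entropy([0.25, 0.25, 0.25, 0.25]))
```

Under these assumptions, a correlation study would compute the entropy at each decoding step and check whether steps that begin a repeated n-gram tend to have higher values.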

Abstract

Transformer-based models have driven the growth of multi-document summarization (MDS). Given their large impact and widespread adoption across natural language processing tasks, investigating their performance and behaviors in the context of MDS is crucial for advancing the field and improving summary quality. To examine the behaviors of Transformer-based MDS models thoroughly, this paper presents five empirical studies: (1) quantitatively measuring the impact of document boundary separators; (2) exploring the effectiveness of different mainstream Transformer structures; (3) examining the sensitivity of the encoder and decoder; (4) discussing different training strategies; and (5) investigating repetition in summary generation. The experimental results on prevalent MDS datasets and eleven evaluation metrics show the…
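
As an illustration of what study (1) varies, the sketch below builds the same multi-document input two ways: with an explicit boundary separator between source documents, and as a flat concatenation with no boundary marker. The separator string `<doc-sep>` and both function names are placeholders chosen for this example; the abstract does not say which separator tokens the paper uses.

```python
def join_with_separator(documents, separator="<doc-sep>"):
    """Concatenate source documents with an explicit boundary marker.

    `separator` is a placeholder token, not one confirmed by the paper.
    """
    # Normalize whitespace and drop empty documents before joining.
    cleaned = [" ".join(doc.split()) for doc in documents if doc.strip()]
    return f" {separator} ".join(cleaned)


def join_flat(documents):
    """Concatenate source documents with no boundary marker (baseline)."""
    cleaned = [" ".join(doc.split()) for doc in documents if doc.strip()]
    return " ".join(cleaned)


if __name__ == "__main__":
    docs = ["Storm hits coast.", "Officials order\nevacuation."]
    print(join_with_separator(docs))
    print(join_flat(docs))
```

In a real pipeline the separator would usually be registered as a special token with the model's tokenizer so that it is not split into subwords; that step is omitted here to keep the example free of external dependencies.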

Taxonomy

Topics: Semantic Web and Ontologies · Advanced Text Analysis Techniques · Topic Modeling

Methods: Attention Is All You Need · Residual Connection · Byte Pair Encoding · Layer Normalization · Label Smoothing · Linear Layer · Adam · Dropout · Multi-Head Attention · Dense Connections