Parallel Hierarchical Transformer with Attention Alignment for Abstractive Multi-Document Summarization

Ye Ma; Lu Zong

arXiv:2208.07845·cs.CL·August 17, 2022

Open Access

TL;DR

This paper introduces a Parallel Hierarchical Transformer with attention alignment for multi-document summarization, improving coverage and quality of generated summaries by leveraging hierarchical attention and attention calibration.

Contribution

The study proposes a novel hierarchical Transformer architecture with attention alignment for multi-document summarization (MDS), improving dependency modeling and summary coverage relative to existing models.

Findings

01. Improved ROUGE scores over baselines.

02. Higher quality summaries in human evaluations.

03. Efficient processing with low computational cost.

Abstract

In comparison to single-document summarization, abstractive Multi-Document Summarization (MDS) brings challenges on the representation and coverage of its lengthy and linked sources. This study develops a Parallel Hierarchical Transformer (PHT) with attention alignment for MDS. By incorporating word- and paragraph-level multi-head attentions, the hierarchical architecture of PHT allows better processing of dependencies at both token and document levels. To guide the decoding towards a better coverage of the source documents, the attention-alignment mechanism is then introduced to calibrate beam search with predicted optimal attention distributions. Based on the WikiSum data, a comprehensive evaluation is conducted to test improvements on MDS by the proposed architecture. By better handling the inner- and cross-document information, results in both ROUGE and human evaluation suggest that…
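
To make the two ideas in the abstract concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation. It assumes single-head attention, mean-pooled token vectors as paragraph keys, and a KL-divergence penalty as the calibration term; none of these details are given in the abstract. The names `two_level_context`, `calibrated_score`, and the `weight` parameter are hypothetical.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()


def attend(query, keys, values):
    """Single-head scaled dot-product attention for one query vector."""
    scores = keys @ query / np.sqrt(query.shape[-1])
    weights = softmax(scores)
    return weights @ values, weights


def two_level_context(query, paragraphs):
    """Combine word-level and paragraph-level attention.

    paragraphs: list of (n_tokens_i, d) arrays, one per source paragraph.
    Returns a (d,) context vector and the paragraph attention weights.
    """
    # Word level: attend over the tokens of each paragraph separately.
    word_contexts = np.stack([attend(query, p, p)[0] for p in paragraphs])
    # Paragraph level: mean-pooled tokens stand in for paragraph keys
    # (an assumption made for this sketch, not taken from the paper).
    para_keys = np.stack([p.mean(axis=0) for p in paragraphs])
    _, para_weights = attend(query, para_keys, para_keys)
    # Weight each paragraph's word-level context by its paragraph attention.
    return para_weights @ word_contexts, para_weights


def calibrated_score(log_prob, para_attention, target_attention, weight=1.0):
    """Hypothetical beam-score calibration.

    Penalizes a beam whose paragraph attention deviates from a target
    distribution, using KL(target || attention) as the penalty.
    """
    eps = 1e-12
    kl = np.sum(
        target_attention
        * (np.log(target_attention + eps) - np.log(para_attention + eps))
    )
    return log_prob - weight * kl
```

In this sketch, a beam whose paragraph attention matches the target keeps its original log-probability, while a beam that concentrates on one paragraph is scored lower, which is one plausible way to steer decoding towards broader coverage of the sources.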

Taxonomy

Topics: Topic Modeling · Advanced Text Analysis Techniques · Natural Language Processing Techniques

Methods: Multi-Head Attention · Attention Is All You Need · Test · Linear Layer · Dense Connections · Absolute Position Encodings · Label Smoothing · Position-Wise Feed-Forward Layer · Dropout · Residual Connection