HIBRIDS: Attention with Hierarchical Biases for Structure-aware Long Document Summarization

Shuyang Cao; Lu Wang

arXiv:2203.10741 · cs.CL · March 22, 2022

Open Access

TL;DR

HIBRIDS introduces hierarchical biases into Transformer attention to better encode document structure, improving long document summarization and hierarchical question-summary generation with new annotated datasets.

Contribution

The paper proposes HIBRIDS, a novel method injecting hierarchical biases into Transformer attention, and introduces a new hierarchical question-summary generation task with a labeled dataset.
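To make the idea of a bias inside the attention score concrete, here is a minimal single-head sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the function names (`section_relation`, `biased_attention`), the representation of a section as a tuple path such as `(1, 1)` for section 1.1, and the choice to bucket section pairs by tree path length and level difference are all assumptions made for this example. In a trained model the bias values would be learned parameters; here they are fixed by hand.

```python
import math


def section_relation(a, b):
    """Relate two sections given as tuple paths, e.g. (1, 1) for section 1.1.

    Returns (path_length, level_diff): the number of tree edges between the
    two sections and the depth of b minus the depth of a.
    This bucketing scheme is an assumption made for illustration.
    """
    common = 0
    for x, y in zip(a, b):
        if x != y:
            break
        common += 1
    path_length = (len(a) - common) + (len(b) - common)
    level_diff = len(b) - len(a)
    return path_length, level_diff


def softmax(row):
    m = max(row)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def biased_attention(queries, keys, values, token_sections, bias_table):
    """Scaled dot-product attention with an additive structural bias.

    score(i, j) = q_i . k_j / sqrt(d) + bias_table[relation(sec_i, sec_j)]
    Relations missing from bias_table get a bias of 0.0.
    """
    d = len(queries[0])
    outputs, weights = [], []
    for i, q in enumerate(queries):
        scores = []
        for j, k in enumerate(keys):
            dot = sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
            rel = section_relation(token_sections[i], token_sections[j])
            scores.append(dot + bias_table.get(rel, 0.0))
        w = softmax(scores)
        weights.append(w)
        outputs.append(
            [sum(w[j] * values[j][c] for j in range(len(values)))
             for c in range(len(values[0]))]
        )
    return outputs, weights


# Toy input: three tokens sitting in sections 1, 1.1, and 2.
token_sections = [(1,), (1, 1), (2,)]
queries = [[0.0, 0.0]] * 3  # zero queries, so only the bias shapes attention
keys = [[0.0, 0.0]] * 3
values = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Hand-picked biases: favor parent/child pairs, penalize sibling sections.
bias_table = {(1, 1): 1.0, (1, -1): 1.0, (2, 0): -1.0}

outputs, weights = biased_attention(
    queries, keys, values, token_sections, bias_table
)
```

With zero queries and keys, the content term vanishes, so the token in section 1 attends most to the token in its subsection 1.1 and least to the token in section 2. This isolates what the structural bias contributes on top of ordinary content-based attention.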

Findings

01. On hierarchical question-summary generation, HIBRIDS produces better hierarchies than comparison models in both hierarchy quality and content coverage.

02. The model improves long-form summary generation, as measured by ROUGE scores.

03. Human judges favor HIBRIDS-generated hierarchies and summaries.

Abstract

Document structure is critical for efficient information consumption. However, it is challenging to encode it efficiently into the modern Transformer architecture. In this work, we present HIBRIDS, which injects Hierarchical Biases foR Incorporating Document Structure into the calculation of attention scores. We further present a new task, hierarchical question-summary generation, for summarizing salient content in the source document into a hierarchy of questions and summaries, where each follow-up question inquires about the content of its parent question-summary pair. We also annotate a new dataset with 6,153 question-summary hierarchies labeled on long government reports. Experiment results show that our model produces better question-summary hierarchies than comparisons on both hierarchy quality and content coverage, a finding also echoed by human judges. Additionally, our model…
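The output format of the hierarchical question-summary generation task can be pictured as a tree in which each node holds one question-summary pair and each child asks a follow-up about its parent. The sketch below is a hypothetical representation for illustration only: the class name `QSNode`, its fields, and the placeholder strings are assumptions, not the dataset's actual schema or contents.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class QSNode:
    """One question-summary pair plus its follow-up pairs (hypothetical schema)."""
    question: str
    summary: str
    children: List["QSNode"] = field(default_factory=list)


def walk(node: QSNode, depth: int = 0) -> Iterator[Tuple[int, QSNode]]:
    """Yield (depth, node) pairs in depth-first, parent-before-child order."""
    yield depth, node
    for child in node.children:
        yield from walk(child, depth + 1)


# Placeholder content; real hierarchies are annotated on government reports.
root = QSNode(
    "Q1 (placeholder question)",
    "S1 (placeholder summary)",
    children=[
        QSNode(
            "Q1.1 (follow-up on S1)",
            "S1.1 (placeholder summary)",
            children=[QSNode("Q1.1.1 (follow-up on S1.1)", "S1.1.1")],
        ),
        QSNode("Q1.2 (follow-up on S1)", "S1.2 (placeholder summary)"),
    ],
)

flat = [(depth, node.question) for depth, node in walk(root)]
for depth, question in flat:
    print("  " * depth + question)
```

Flattening the tree depth-first keeps every follow-up question directly beneath the pair it elaborates on, which is one simple way to serialize such a hierarchy for reading.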

Taxonomy

Topics: Topic Modeling · Advanced Text Analysis Techniques · Natural Language Processing Techniques

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Byte Pair Encoding · Residual Connection · Position-Wise Feed-Forward Layer · Dense Connections · Softmax · Label Smoothing · Dropout