Predicting Discourse Trees from Transformer-based Neural Summarizers

Wen Xiao; Patrick Huber; Giuseppe Carenini

arXiv:2104.07058 · cs.CL · April 16, 2021

TL;DR

This paper shows that discourse structures can be extracted from the self-attention matrices of transformer-based neural summarizers. The results suggest that the known benefit of discourse information for summarization also runs in the other direction: summarizers learn discourse information.

Contribution

The paper introduces a method for inferring unlabeled RST-style discourse trees from the self-attention matrices of pre-trained summarizers, and shows that these matrices encode both dependency- and constituency-style discourse information.

Findings

1. Summarizers learn discourse structures in their self-attention matrices.

2. Discourse information is typically encoded in a single attention head, covering both long- and short-distance discourse dependencies.

3. The learned discourse information appears to be general and transferable across domains.

Abstract

Previous work indicates that discourse information benefits summarization. In this paper, we explore whether this synergy between discourse and summarization is bidirectional, by inferring document-level discourse trees from pre-trained neural summarizers. In particular, we generate unlabeled RST-style discourse trees from the self-attention matrices of the transformer model. Experiments across models and datasets reveal that the summarizer learns both dependency- and constituency-style discourse information, which is typically encoded in a single head, covering long- and short-distance discourse dependencies. Overall, the experimental results suggest that the learned discourse information is general and transferable inter-domain.
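
As a rough illustration of how an unlabeled constituency tree might be read off a single attention head, the sketch below runs a CKY-style dynamic program over a square attention matrix whose rows and columns are discourse units. The function name `cky_tree`, the merge score (mean attention in both directions between two adjacent spans), and the toy matrix are assumptions made for illustration only; this is not the authors' implementation, which is available in the repository listed under Code & Models.

```python
from functools import lru_cache


def cky_tree(attn):
    """Return (score, tree) for the highest-scoring binary tree over units 0..n-1.

    attn[i][j] is the attention weight from unit i to unit j (hypothetical input).
    Leaves are unit indices; internal nodes are (left, right) tuples.
    """
    n = len(attn)

    def cross(i, k, j):
        # Assumed merge score: mean attention, in both directions,
        # between the adjacent spans [i..k] and [k+1..j].
        total = 0.0
        for a in range(i, k + 1):
            for b in range(k + 1, j + 1):
                total += attn[a][b] + attn[b][a]
        return total / (2 * (k - i + 1) * (j - k))

    @lru_cache(maxsize=None)
    def best(i, j):
        if i == j:
            return 0.0, i  # a single unit is a leaf with no merge score
        candidates = []
        for k in range(i, j):  # try every split point of the span [i..j]
            left_score, left_tree = best(i, k)
            right_score, right_tree = best(k + 1, j)
            total = left_score + right_score + cross(i, k, j)
            candidates.append((total, k, (left_tree, right_tree)))
        # Highest total wins; ties go to the leftmost split.
        score, _, tree = max(candidates, key=lambda c: (c[0], -c[1]))
        return score, tree

    return best(0, n - 1)


# Toy matrix: units 0-1 attend strongly to each other, as do units 2-3.
attn = [
    [0.0, 0.8, 0.1, 0.1],
    [0.8, 0.0, 0.1, 0.1],
    [0.1, 0.1, 0.0, 0.8],
    [0.1, 0.1, 0.8, 0.0],
]
score, tree = cky_tree(attn)
print(tree)  # ((0, 1), (2, 3))
```

With this scoring, units that attend strongly to each other are merged low in the tree and weakly connected spans are joined near the root. The dynamic program considers every split of every span, so its cost grows cubically in the number of units before the span-score computation is counted.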

Code & Models

Repositories

Wendy-Xiao/summ_guided_disco_parser
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques