Understanding Neural Abstractive Summarization Models via Uncertainty

Jiacheng Xu; Shrey Desai; Greg Durrett

arXiv:2010.07882·cs.CL·October 16, 2020·1 cites

TL;DR

This paper investigates the uncertainty in neural abstractive summarization models, revealing how entropy relates to token copying, sentence position, and syntactic factors, and linking attention mechanisms to these uncertainties.

Contribution

It introduces an analysis of model uncertainty in summarization, connecting entropy to token copying behavior and attention, providing insights into model interpretability.

Findings

01 Low entropy correlates with token copying.

02 Uncertainty varies with sentence position and syntactic distance.

03 Attention mechanisms relate to observed uncertainty patterns.

Abstract

An advantage of seq2seq abstractive summarization models is that they generate text in a free-form manner, but this flexibility makes it difficult to interpret model behavior. In this work, we analyze summarization decoders in both blackbox and whitebox ways by studying the entropy, or uncertainty, of the model's token-level predictions. For two strong pre-trained models, PEGASUS and BART, on two summarization datasets, we find a strong correlation between low prediction entropy and where the model copies tokens rather than generating novel text. The decoder's uncertainty also connects to factors like sentence position and syntactic distance between adjacent pairs of tokens, giving a sense of what factors make a context particularly selective for the model's next output token. Finally, we study the relationship of decoder uncertainty and attention behavior to understand how attention…
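
To make the central quantity concrete, here is a minimal sketch of token-level prediction entropy: the Shannon entropy of the softmax distribution over the vocabulary at a single decoding step. The function names (`softmax`, `token_entropy`) and the toy four-token logits are illustrative assumptions, not taken from the paper or its repository; a real decoder would supply one logit per vocabulary entry.

```python
import math

def softmax(logits):
    """Convert raw scores to probabilities (max-shifted for numerical stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def token_entropy(logits):
    """Shannon entropy (in nats) of the next-token distribution implied by logits."""
    probs = softmax(logits)
    return -sum(p * math.log(p) for p in probs if p > 0.0)

# Toy logits over a hypothetical 4-token vocabulary (illustration only).
peaked = [10.0, 0.0, 0.0, 0.0]  # model strongly prefers one token
flat = [1.0, 1.0, 1.0, 1.0]     # model is maximally unsure

print(token_entropy(peaked))  # close to 0: low uncertainty
print(token_entropy(flat))    # equals log(4): the maximum for 4 outcomes
```

Under this definition, a peaked distribution yields entropy near zero, which is the regime the abstract associates with copying, while a flat distribution yields the maximum value, the log of the vocabulary size.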

Code & Models

Repositories

jiacheng-xu/text-sum-uncertainty (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques

Methods: PEGASUS · Linear Layer · Adam · Byte Pair Encoding · Softmax · Layer Normalization · Dense Connections · Multi-Head Attention · Tanh Activation · Dropout