Dissecting Generation Modes for Abstractive Summarization Models via Ablation and Attribution
Jiacheng Xu, Greg Durrett

TL;DR
This paper introduces a two-step interpretability framework for neural abstractive summarization models: ablation first assigns each decoder decision to a generation mode, and attribution methods then trace the input-dependent decisions back to their sources. The analysis surfaces memorized phrases and more complex phenomena such as sentence fusion.
Contribution
The paper presents an approach for dissecting and interpreting summarization models: it categorizes decoder decisions into generation modes, then applies attribution techniques to find where the input-dependent decisions originate.
Findings
Identifies when models rely on input versus language modeling.
Detects memorized phrases and their training origins.
Analyzes complex phenomena like sentence fusion at the instance level.
Abstract
Despite the prominence of neural abstractive summarization models, we know little about how they actually form summaries and how to understand where their decisions come from. We propose a two-step method to interpret summarization model decisions. We first analyze the model's behavior by ablating the full model to categorize each decoder decision into one of several generation modes: roughly, is the model behaving like a language model, is it relying heavily on the input, or is it somewhere in between? After isolating decisions that do depend on the input, we explore interpreting these decisions using several different attribution methods. We compare these techniques based on their ability to select content and reconstruct the model's predicted token from perturbations of the input, thus revealing whether highlighted attributions are truly important for the generation of the next…
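The mode-categorization step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the distance measure (total variation), the thresholds `low` and `high`, the mode labels, and the function names are all assumptions chosen for illustration. The idea is only that a decoder decision looks language-model-like when the full model's next-token distribution barely changes once the input document is ablated, and input-dependent when it changes a lot.

```python
def total_variation(p, q):
    """Total variation distance between two next-token distributions.

    p and q map tokens to probabilities; missing tokens count as 0.
    """
    vocab = set(p) | set(q)
    return 0.5 * sum(abs(p.get(t, 0.0) - q.get(t, 0.0)) for t in vocab)


def generation_mode(full_dist, ablated_dist, low=0.2, high=0.6):
    """Label one decoder decision by comparing the full model's distribution
    with the distribution of an ablated model that does not see the input.

    The thresholds are hypothetical and would need tuning on real data.
    """
    distance = total_variation(full_dist, ablated_dist)
    if distance < low:
        return "lm-like"          # input barely matters for this token
    if distance > high:
        return "input-dependent"  # prediction changes a lot without the input
    return "in-between"


# Toy distributions, invented for illustration only.
full = {"Paris": 0.9, "the": 0.1}
no_input = {"the": 0.8, "a": 0.2}
mode = generation_mode(full, no_input)
```

With these toy numbers the two distributions put most of their mass on different tokens, so the decision is labeled input-dependent. Decisions with that label are the ones the abstract passes on to attribution methods.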
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques
