On the Trade-off between Redundancy and Local Coherence in Summarization
Ronald Cardenas, Matthias Galle, Shay B. Cohen

TL;DR
This paper explores how controlling for redundancy and cohesion affects the informativeness of extractive summaries, especially for long, highly redundant documents, under both reward-guided and unsupervised optimization; the unsupervised systems implement a psycholinguistic theory.
Contribution
It introduces two unsupervised systems that incorporate psycholinguistic principles to balance informativeness, redundancy, and cohesion in summaries, demonstrating their effectiveness through evaluations.
Findings
Systems optimizing for cohesion produce more organized summaries.
Unsupervised systems achieve high cohesion but with reduced informativeness.
Cognitive-inspired models influence the trade-off between summary properties.
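The tension named in the findings above can be illustrated with a toy greedy selector. This is only a minimal sketch under stated assumptions, not the paper's systems: the function names (`tokens`, `cosine`, `select`), the weights `w_red` and `w_coh`, and the use of bag-of-words cosine similarity for all three properties are hypothetical choices made for illustration. Informativeness is approximated as similarity to the whole document, redundancy as the highest similarity to any already selected sentence, and cohesion as similarity to the most recently selected sentence.

```python
import re
from collections import Counter
from math import sqrt


def tokens(sentence):
    """Lowercased alphabetic tokens (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z]+", sentence.lower())


def cosine(a, b):
    """Cosine similarity between two bags of words."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[w] * cb[w] for w in ca)
    na = sqrt(sum(v * v for v in ca.values()))
    nb = sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0


def select(sentences, k, w_red=0.5, w_coh=0.5):
    """Greedily pick up to k sentences, returned in selection order.

    Hypothetical scoring (an assumption, not taken from the paper):
      score = informativeness - w_red * redundancy + w_coh * cohesion
    """
    toks = [tokens(s) for s in sentences]
    doc = [w for t in toks for w in t]
    chosen = []
    while len(chosen) < min(k, len(sentences)):
        best, best_score = None, float("-inf")
        for i in range(len(sentences)):
            if i in chosen:
                continue
            info = cosine(toks[i], doc)
            # Redundancy: overlap with anything already in the summary.
            red = max((cosine(toks[i], toks[j]) for j in chosen), default=0.0)
            # Cohesion: overlap with the previously selected sentence only.
            coh = cosine(toks[i], toks[chosen[-1]]) if chosen else 0.0
            score = info - w_red * red + w_coh * coh
            if score > best_score:
                best, best_score = i, score
        chosen.append(best)
    return [sentences[i] for i in chosen]


docs = [
    "the cat sat on the mat",
    "the cat sat on the mat",
    "dogs bark loudly at night",
]
no_repeat = select(docs, k=2, w_red=1.0, w_coh=0.0)
cohesive = select(docs, k=2, w_red=0.0, w_coh=1.0)
```

In this toy setup the same lexical-overlap signal is penalized as redundancy and rewarded as cohesion, so raising one weight works against the other: with only the redundancy penalty the duplicate sentence is skipped, and with only the cohesion bonus it is selected.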
Abstract
Extractive summaries are usually presented as lists of sentences with no expected cohesion between them and, unless redundancy is accounted for, with plenty of redundant information. In this paper, we investigate the trade-offs incurred when aiming to control for inter-sentential cohesion and redundancy in extracted summaries, and their impact on informativeness. As a case study, we focus on the summarization of long, highly redundant documents and consider two optimization scenarios: reward-guided and unsupervised. In the reward-guided scenario, we compare systems that control for redundancy and cohesion during sentence scoring. In the unsupervised scenario, we introduce two systems that aim to control all three properties -- informativeness, redundancy, and cohesion -- in a principled way. Both systems implement a psycholinguistic theory that simulates how humans keep track of relevant…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques
