A Novel ILP Framework for Summarizing Content with High Lexical Variety
Wencan Luo, Fei Liu, Zitao Liu, and Diane Litman

TL;DR
This paper introduces an ILP-based summarization framework that handles high lexical diversity by grouping semantically similar words, and reports outperforming baselines on datasets of student responses, product reviews, and news documents.
Contribution
The paper presents a novel ILP framework that applies a low-rank approximation to the sentence-word co-occurrence matrix, improving summarization of lexically diverse content, a key challenge in extractive summarization.
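For orientation, a common concept-coverage ILP for extractive summarization can be written as follows. This is a sketch of the general technique, not necessarily the paper's exact formulation. All symbols are assumptions introduced here for illustration: A_{ij} indicates whether word i occurs in sentence j, y_j selects sentence j, z_i marks word i as covered, w_i is a word weight, l_j is the sentence length, and L is the summary length limit.

```latex
\begin{align}
\max_{y,\,z} \quad & \sum_{i} w_i z_i \\
\text{s.t.} \quad & \sum_{j} A_{ij}\, y_j \ge z_i && \forall i \\
& A_{ij}\, y_j \le z_i && \forall i, j \\
& \sum_{j} l_j\, y_j \le L \\
& y_j \in \{0, 1\}, \quad z_i \in \{0, 1\}
\end{align}
```

In a formulation of this kind, replacing the binary matrix A with a low-rank approximation would let a sentence partially cover words it does not literally contain, which is one plausible way to group semantically similar lexical items. With a real-valued matrix, z_i would presumably need to be relaxed to the interval [0, 1].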
Findings
Outperforms several extractive baselines
Outperforms neural abstractive systems
Effective across multiple content types
Abstract
Summarizing content contributed by individuals can be challenging, because people make different lexical choices even when describing the same events. However, there remains a significant need to summarize such content. Examples include student responses to post-class reflective questions, product reviews, and news articles published by different news agencies about the same events. The high lexical diversity of these documents hinders a system's ability to identify salient content and reduce summary redundancy. In this paper, we overcome this issue by introducing an integer linear programming-based summarization framework. It incorporates a low-rank approximation to the sentence-word co-occurrence matrix to intrinsically group semantically similar lexical items. We conduct extensive experiments on datasets of student responses, product reviews, and news documents.…
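The low-rank idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how the approximation is computed, so the truncated SVD used here is a stand-in chosen for simplicity, not the authors' method. The toy sentences, the function name low_rank_approx, and the rank value are all hypothetical.

```python
import numpy as np

def low_rank_approx(A, rank):
    """Best rank-`rank` approximation of A in Frobenius norm (truncated SVD).

    Assumption: truncated SVD stands in for whatever low-rank method
    the paper actually uses.
    """
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Keep only the top `rank` singular triplets.
    return (U[:, :rank] * s[:rank]) @ Vt[:rank, :]

# Hypothetical toy data: short responses with overlapping vocabulary.
sentences = [
    "the bicycle example was confusing",
    "the bike example was unclear",
    "more practice problems please",
    "need more practice questions",
]
tokens = [s.split() for s in sentences]
vocab = sorted({w for t in tokens for w in t})

# Binary word-by-sentence co-occurrence matrix: A[i, j] = 1 if word i is in sentence j.
A = np.array([[1.0 if w in t else 0.0 for t in tokens] for w in vocab])

A_hat = low_rank_approx(A, rank=2)

# After smoothing, a word can receive nonzero weight in a sentence that
# does not literally contain it but shares context with one that does.
i, j = vocab.index("bike"), 0  # "bike" does not occur in sentence 0
print(f"original: {A[i, j]:.2f}, low-rank: {A_hat[i, j]:.2f}")
```

The design point is that the approximated matrix is dense and real-valued, so words used in similar contexts end up with similar rows. How much smoothing occurs depends on the chosen rank and on the data.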
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques
