RankSum: An unsupervised extractive text summarization based on rank fusion
A. Joshi, E. Fidalgo, E. Alegre, and R. Alaiz-Rodriguez

TL;DR
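The paper introduces Ranksum, an unsupervised extractive summarization method for single documents. It ranks sentences by a weighted fusion of four sentence features (topic, semantic content, keywords, and position), with fusion weights learned from a labeled document set. The authors report results that outperform prior methods on CNN/DailyMail and DUC 2002.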
Contribution
It presents a rank fusion approach to extractive summarization that combines topic, semantic, keyword, and positional sentence features, each scored without supervision.
Findings
Ranksum is reported to outperform state-of-the-art methods on the CNN/DailyMail and DUC 2002 datasets.
The four features are combined through a weighted fusion whose weights are learned from labeled documents.
The learned fusion weights generalize across datasets.
Abstract
In this paper, we propose Ranksum, an approach for extractive text summarization of single documents based on the rank fusion of four multi-dimensional sentence features extracted for each sentence: topic information, semantic content, significant keywords, and position. Ranksum obtains the sentence saliency rankings corresponding to each feature in an unsupervised way, followed by the weighted fusion of the four scores to rank the sentences according to their significance. The scores are generated in a completely unsupervised way, and a labeled document set is required to learn the fusion weights. Since we found that the fusion weights can generalize to other datasets, we consider Ranksum an unsupervised approach. To determine topic rank, we employ probabilistic topic models, whereas semantic information is captured using sentence embeddings. To derive rankings using sentence…
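The weighted fusion step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`min_max_normalize`, `fuse_scores`, `select_summary`), the min-max normalization, the equal weights, and the example scores are all assumptions made for illustration. In the paper, the per-feature scores come from topic models, sentence embeddings, keywords, and position, and the weights are learned from labeled documents.

```python
from typing import Dict, List, Sequence


def min_max_normalize(scores: Sequence[float]) -> List[float]:
    """Rescale scores to [0, 1]; an assumed normalization, not taken from the paper."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All sentences tie on this feature, so it carries no ranking signal.
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def fuse_scores(
    feature_scores: Dict[str, Sequence[float]],
    weights: Dict[str, float],
) -> List[float]:
    """Return one fused score per sentence as a weighted sum of normalized features."""
    n = len(next(iter(feature_scores.values())))
    fused = [0.0] * n
    for name, scores in feature_scores.items():
        if len(scores) != n:
            raise ValueError(f"feature {name!r} has {len(scores)} scores, expected {n}")
        for i, s in enumerate(min_max_normalize(scores)):
            fused[i] += weights[name] * s
    return fused


def select_summary(fused: Sequence[float], k: int) -> List[int]:
    """Pick the k highest-scoring sentence indices, returned in document order."""
    top = sorted(range(len(fused)), key=lambda i: fused[i], reverse=True)[:k]
    return sorted(top)


# Hypothetical per-sentence scores for a four-sentence document.
feature_scores = {
    "topic": [0.9, 0.2, 0.5, 0.1],
    "semantic": [0.7, 0.3, 0.8, 0.2],
    "keyword": [0.4, 0.1, 0.9, 0.3],
    "position": [1.0, 0.75, 0.5, 0.25],
}
# Equal weights are a placeholder; the paper learns these from labeled data.
weights = {"topic": 0.25, "semantic": 0.25, "keyword": 0.25, "position": 0.25}

summary_indices = select_summary(fuse_scores(feature_scores, weights), k=2)
print(summary_indices)  # [0, 2]
```

With these placeholder values, the first and third sentences receive the highest fused scores. Changing the weights shifts how much each feature influences the final ranking, which is the quantity the paper learns from labeled documents.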
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling
Methods: Sparse Evolutionary Training
