TL;DR
HyperSum is an unsupervised extractive dialogue summarization method that builds sentence embeddings from very high-dimensional random vectors, clusters them, and extracts the cluster medoids as the summary. It often outperforms state-of-the-art summarizers in accuracy and faithfulness while running 10 to 100 times faster.
Contribution
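HyperSum introduces a sentence-embedding approach for unsupervised extractive summarization that exploits the pseudo-orthogonality of randomly initialized, extremely high-dimensional vectors, combining the efficiency of lexical summarization with the accuracy of neural approaches.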
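The pseudo-orthogonality property can be checked empirically. The sketch below is not taken from the paper: it assumes bipolar vectors with entries in {-1, +1}, and the helper names (`random_bipolar`, `max_abs_cosine`) are hypothetical. It draws a handful of random vectors and reports the largest absolute cosine similarity among all pairs, which should shrink toward zero as the dimension grows.

```python
import math
import random


def random_bipolar(dim, rng):
    """Draw a random vector with entries in {-1, +1} (assumed vector type)."""
    return [rng.choice((-1, 1)) for _ in range(dim)]


def cosine(u, v):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def max_abs_cosine(dim, n_vectors=20, seed=0):
    """Largest |cosine| over all pairs of n_vectors random bipolar vectors."""
    rng = random.Random(seed)
    vectors = [random_bipolar(dim, rng) for _ in range(n_vectors)]
    return max(
        abs(cosine(vectors[i], vectors[j]))
        for i in range(n_vectors)
        for j in range(i + 1, n_vectors)
    )


# Compare a low, a medium, and a high dimension.
for dim in (10, 100, 10_000):
    print(dim, round(max_abs_cosine(dim), 3))
```

For independent bipolar vectors the cosine similarity has mean 0 and standard deviation 1/sqrt(dim), so at 10,000 dimensions any two random vectors are close to orthogonal without any training.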
Findings
HyperSum often outperforms state-of-the-art summarizers in both summary accuracy and faithfulness.
HyperSum is 10 to 100 times faster than comparable methods.
HyperSum is open-sourced as a strong baseline for unsupervised extractive summarization.
Abstract
We present HyperSum, an extractive summarization framework that captures both the efficiency of traditional lexical summarization and the accuracy of contemporary neural approaches. HyperSum exploits the pseudo-orthogonality that emerges when randomly initializing vectors at extremely high dimensions ("blessing of dimensionality") to construct representative and efficient sentence embeddings. Simply clustering the obtained embeddings and extracting their medoids yields competitive summaries. HyperSum often outperforms state-of-the-art summarizers -- in terms of both summary accuracy and faithfulness -- while being 10 to 100 times faster. We open-source HyperSum as a strong baseline for unsupervised extractive summarization.
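The embed, cluster, and extract-medoids recipe described in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how sentence embeddings are composed, so the sketch assumes one random bipolar vector per whitespace-separated word, summed per sentence, followed by a plain k-medoids loop on cosine distance with farthest-first initialisation. All function names, the toy dialogue, and the default `dim` are hypothetical.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def embed_sentences(sentences, dim=4000, seed=0):
    """Assumed scheme: sum one random bipolar vector per word token."""
    rng = random.Random(seed)
    word_vectors = {}  # word -> random vector, drawn once and reused
    embeddings = []
    for sentence in sentences:
        total = [0] * dim
        for word in sentence.lower().split():
            if word not in word_vectors:
                word_vectors[word] = [rng.choice((-1, 1)) for _ in range(dim)]
            total = [t + w for t, w in zip(total, word_vectors[word])]
        embeddings.append(total)
    return embeddings


def k_medoids(embeddings, k, max_iter=20):
    """Alternating k-medoids on cosine distance; returns sorted medoid indices."""
    n = len(embeddings)
    if n == 0:
        return []
    k = min(k, n)
    dist = [[1.0 - cosine(u, v) for v in embeddings] for u in embeddings]
    # Farthest-first initialisation starting from sentence 0 (an assumption).
    medoids = [0]
    while len(medoids) < k:
        medoids.append(max(range(n), key=lambda i: min(dist[i][m] for m in medoids)))
    for _ in range(max_iter):
        # Assignment step: attach every sentence to its nearest medoid.
        clusters = {m: [] for m in medoids}
        for i in range(n):
            clusters[min(clusters, key=lambda m: dist[i][m])].append(i)
        # Update step: the new medoid minimises total distance within its cluster.
        new_medoids = [
            min(members, key=lambda c: sum(dist[c][j] for j in members)) if members else m
            for m, members in clusters.items()
        ]
        if set(new_medoids) == set(medoids):
            break
        medoids = new_medoids
    return sorted(set(medoids))


def summarize(sentences, k=2):
    """Return the k medoid sentences in their original order."""
    return [sentences[i] for i in k_medoids(embed_sentences(sentences), k)]


# Toy dialogue with two topics (dinner plans, train tickets).
dialogue = [
    "shall we get pizza for dinner tonight",
    "pizza for dinner sounds great",
    "yes pizza tonight please",
    "did you book the train tickets",
    "i will book the train tickets tomorrow",
    "book the tickets before friday",
]
print(summarize(dialogue, k=2))
```

Because unrelated words receive nearly orthogonal vectors, sentences that share words end up with high cosine similarity and sentences that share none sit near zero, so on this toy input the two medoids should come from the two different topics. No training or pretrained model is involved, which is consistent with the efficiency claim, though the real system's speed and accuracy depend on details not given here.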
