Document summarization using positive pointwise mutual information
Aji S, Ramachandra Kaimal

TL;DR
This paper introduces a novel document summarization method that leverages positive pointwise mutual information to weight term-sentence matrices, improving the extraction of significant sentences for better summaries.
Contribution
The paper proposes a summarization technique that uses positive pointwise mutual information to measure semantic similarity. The authors report that it outperforms most existing methods on large documents.
Findings
Reported to outperform most existing summarization methods on large documents
Effective in handling large documents
Uses semantic similarity for better sentence selection
Abstract
The success of document summarization depends on how well the method identifies the significant sentences in a document. The collection of unique words characterizes the major signature of the document and forms the basis for the Term-Sentence-Matrix (TSM). Positive Pointwise Mutual Information, which works well for measuring semantic similarity in the Term-Sentence-Matrix, is used in our method to assign a weight to each entry in the TSM. The Sentence-Rank-Matrix generated from this weighted TSM is then used to extract a summary from the document. Our experiments show that this method outperforms most existing methods in producing summaries from large documents.
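The weighting step described in the abstract can be illustrated with a minimal sketch. It assumes the standard definition PPMI(t, s) = max(0, log2(P(t, s) / (P(t) P(s)))), with probabilities estimated from raw counts in the term-sentence matrix. The abstract does not define how the Sentence-Rank-Matrix is built, so the scoring step here (summing each sentence's PPMI weights) is only a stand-in, not the authors' method. All function names and the regex tokenizer are hypothetical.

```python
import math
import re
from collections import Counter


def term_sentence_matrix(sentences):
    """Return (vocab, matrix); matrix[i][j] counts term i in sentence j."""
    # Assumption: a simple lowercase, letters-only tokenizer.
    tokenized = [re.findall(r"[a-z]+", s.lower()) for s in sentences]
    vocab = sorted({w for toks in tokenized for w in toks})
    index = {w: i for i, w in enumerate(vocab)}
    matrix = [[0] * len(sentences) for _ in vocab]
    for j, toks in enumerate(tokenized):
        for w, c in Counter(toks).items():
            matrix[index[w]][j] = c
    return vocab, matrix


def ppmi(matrix):
    """Replace each count with its positive pointwise mutual information."""
    total = sum(sum(row) for row in matrix)
    row_sums = [sum(row) for row in matrix]        # term totals
    col_sums = [sum(col) for col in zip(*matrix)]  # sentence totals
    weighted = []
    for i, row in enumerate(matrix):
        new_row = []
        for j, count in enumerate(row):
            if count == 0:
                new_row.append(0.0)  # PMI is undefined at zero; PPMI uses 0
            else:
                pmi = math.log2(count * total / (row_sums[i] * col_sums[j]))
                new_row.append(max(0.0, pmi))
        weighted.append(new_row)
    return weighted


def summarize(sentences, k=2):
    """Pick the k highest-scoring sentences, returned in document order."""
    _, tsm = term_sentence_matrix(sentences)
    weighted = ppmi(tsm)
    # Assumption: score a sentence by the sum of its PPMI weights.
    # This stands in for the paper's Sentence-Rank-Matrix.
    scores = [sum(col) for col in zip(*weighted)]
    ranked = sorted(range(len(sentences)), key=lambda j: -scores[j])
    return [sentences[j] for j in sorted(ranked[:k])]
```

Calling `summarize(list_of_sentences, k=3)` returns three sentences from the input in their original order. Clipping negative PMI values to zero keeps only term-sentence pairs that co-occur more often than chance would predict, which is the property the abstract relies on for weighting.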
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques
