Topic Modeling Based Extractive Text Summarization

Kalliath Abdul Rasheed Issam; Shivam Patel; Subalalitha C. N

arXiv:2106.15313·cs.CL·June 30, 2021

TL;DR

This paper introduces a novel extractive summarization method using topic modeling to cluster content and generate summaries, tested on the challenging WikiHow dataset, showing promising ROUGE scores compared to existing models.

Contribution

The paper presents a new extractive summarization approach that leverages topic modeling for clustering and summarization, specifically addressing the challenges of the WikiHow dataset.

Findings

01. Model achieves encouraging ROUGE results.

02. Effective in capturing varied information in source documents.

03. Performs well on a challenging dataset compared to existing models.
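For readers unfamiliar with the metric behind these findings, ROUGE-N measures n-gram overlap between a generated summary and a reference summary. The sketch below is a minimal illustration only, assuming whitespace tokenization and lowercasing; the function names `ngrams` and `rouge_n` are hypothetical, and this is not the evaluation script used in the paper (standard ROUGE toolkits also apply stemming and other preprocessing).

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Return ROUGE-N precision, recall, and F1 for two strings.

    Assumption: tokens are lowercased and split on whitespace.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)

    # Clipped overlap: each n-gram counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((cand & ref).values())
    cand_total = sum(cand.values())
    ref_total = sum(ref.values())

    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    scores = rouge_n("the cat sat on the mat", "the cat is on the mat", n=1)
    print(scores)
```

Recall is the share of reference n-grams that the summary recovers, and precision is the share of summary n-grams that appear in the reference. Published results usually report ROUGE-1, ROUGE-2, and ROUGE-L together.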

Abstract

Text summarization is an approach for identifying important information present within text documents. This computational technique aims to generate shorter versions of the source text, by including only the relevant and salient information present within the source text. In this paper, we propose a novel method to summarize a text document by clustering its contents based on latent topics produced using topic modeling techniques and by generating extractive summaries for each of the identified text clusters. All extractive sub-summaries are later combined to generate a summary for any given source document. We utilize the lesser used and challenging WikiHow dataset in our approach to text summarization. This dataset is unlike the commonly used news datasets which are available for text summarization. The well-known news datasets present their most important information in the first few…
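The pipeline outlined in the abstract (group content by latent topic, extract a sub-summary per group, then combine the sub-summaries) can be sketched roughly as follows. This is a toy illustration under loud assumptions, not the authors' implementation: the topic-word weights in `TOPIC_WORDS` are hard-coded stand-ins for what a topic model would produce, and the per-cluster extraction step simply keeps the highest-scoring sentence, which is a placeholder for whatever extractive method the paper actually uses. All names here are hypothetical.

```python
import re
from collections import defaultdict

# Assumption: toy topic-word weights. In practice these would come from
# a trained topic model rather than being written by hand.
TOPIC_WORDS = {
    "preparation": {"gather": 0.4, "tools": 0.3, "materials": 0.3},
    "assembly": {"attach": 0.4, "screw": 0.3, "tighten": 0.3},
}


def split_sentences(text):
    """Naive sentence splitter on ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def topic_scores(sentence, topic_words):
    """Score a sentence against each topic by summing matched word weights."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return {
        topic: sum(weights.get(tok, 0.0) for tok in tokens)
        for topic, weights in topic_words.items()
    }


def summarize(text, topic_words, per_topic=1):
    """Cluster sentences by best-matching topic, keep the top sentences
    from each cluster, and join them in original document order."""
    sentences = split_sentences(text)
    clusters = defaultdict(list)
    for idx, sent in enumerate(sentences):
        scores = topic_scores(sent, topic_words)
        best = max(scores, key=scores.get)
        if scores[best] > 0:  # skip sentences that match no topic
            clusters[best].append((scores[best], idx))

    chosen = []
    for members in clusters.values():
        # Highest score first; earlier sentences win ties.
        members.sort(key=lambda m: (-m[0], m[1]))
        chosen.extend(idx for _, idx in members[:per_topic])

    return " ".join(sentences[i] for i in sorted(chosen))


if __name__ == "__main__":
    doc = (
        "Gather the tools and materials first. "
        "Clear some space on the floor. "
        "Attach the legs and tighten each screw. "
        "Attach the shelf."
    )
    print(summarize(doc, TOPIC_WORDS))
```

Because one sentence is taken from every topic cluster, the combined summary draws from across the document rather than only from its opening lines.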
