JADS: A Framework for Self-supervised Joint Aspect Discovery and Summarization

Xiaobo Guo; Jay Desai; Srinivasan H. Sengamedu

arXiv:2405.18642 · cs.AI · May 30, 2024

Open Access

TL;DR

JADS introduces a self-supervised framework that jointly discovers aspects and summarizes text documents in a single step, outperforming traditional two-step methods and enhancing clustering and factual accuracy.

Contribution

The paper presents a novel self-supervised joint aspect discovery and summarization model that integrates topic detection and summarization into one process, improving performance and stability.

Findings

1. Outperforms two-step baselines in aspect discovery and summarization.

2. Pretraining enhances model performance and stability.

3. Embeddings from JADS improve clustering and semantic alignment.

Abstract

To generate summaries that include multiple aspects or topics for text documents, most approaches use clustering or topic modeling to group relevant sentences and then generate a summary for each group. These approaches struggle to optimize the summarization and clustering algorithms jointly. On the other hand, aspect-based summarization requires known aspects. Our solution integrates topic discovery and summarization into a single step. Given text data, our Joint Aspect Discovery and Summarization algorithm (JADS) discovers aspects from the input and generates a summary of the topics, in one step. We propose a self-supervised framework that creates a labeled dataset by first mixing sentences from multiple documents (e.g., CNN/DailyMail articles) as the input and then uses the article summaries from the mixture as the labels. The JADS model outperforms the two-step baselines. With…
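The self-supervised data construction described in the abstract (mix sentences from several articles to form the input, and use those articles' summaries as the label) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `make_mixed_sample`, the full shuffle across articles, the number of mixed articles `k`, and the `" </s> "` separator between target summaries are all assumptions made for the example.

```python
import random
from typing import List, Tuple

# An article is a (sentences, reference_summary) pair, e.g. from CNN/DailyMail.
Article = Tuple[List[str], str]


def make_mixed_sample(
    articles: List[Article],
    k: int = 2,
    seed: int = 0,
    sep: str = " </s> ",  # hypothetical separator between summaries
) -> Tuple[str, str]:
    """Build one (source, target) training pair from k randomly chosen articles.

    Assumption: sentences are shuffled across articles, so the model must
    recover the aspect grouping on its own. The paper may mix differently.
    """
    rng = random.Random(seed)
    chosen = rng.sample(articles, k)

    # Input: all sentences of the chosen articles, interleaved at random.
    sentences = [s for sents, _ in chosen for s in sents]
    rng.shuffle(sentences)
    source = " ".join(sentences)

    # Label: the chosen articles' summaries, one per hidden aspect.
    target = sep.join(summary for _, summary in chosen)
    return source, target


if __name__ == "__main__":
    toy = [
        (["A1.", "A2."], "Sum A"),
        (["B1.", "B2.", "B3."], "Sum B"),
        (["C1."], "Sum C"),
    ]
    src, tgt = make_mixed_sample(toy, k=2, seed=0)
    print(src)
    print(tgt)
```

Because the label is assembled from existing article summaries, no manual aspect annotation is needed; a sequence-to-sequence model trained on such pairs has to separate the mixed sentences and summarize each group in a single pass.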

Taxonomy

Topics: Web Data Mining and Analysis