JADS: A Framework for Self-supervised Joint Aspect Discovery and Summarization
Xiaobo Guo, Jay Desai, Srinivasan H. Sengamedu

TL;DR
JADS introduces a self-supervised framework that jointly discovers aspects and summarizes text documents in a single step, outperforming traditional two-step methods and enhancing clustering and factual accuracy.
Contribution
The paper presents a novel self-supervised joint aspect discovery and summarization model that integrates topic detection and summarization into one process, improving performance and stability.
Findings
Outperforms two-step baselines in aspect discovery and summarization.
Pretraining enhances model performance and stability.
Embeddings from JADS improve clustering and semantic alignment.
Abstract
To generate summaries that include multiple aspects or topics for text documents, most approaches use clustering or topic modeling to group relevant sentences and then generate a summary for each group. These approaches struggle to optimize the summarization and clustering algorithms jointly. On the other hand, aspect-based summarization requires known aspects. Our solution integrates topic discovery and summarization into a single step. Given text data, our Joint Aspect Discovery and Summarization algorithm (JADS) discovers aspects from the input and generates a summary of the topics, in one step. We propose a self-supervised framework that creates a labeled dataset by first mixing sentences from multiple documents (e.g., CNN/DailyMail articles) as the input and then using the article summaries from the mixture as the labels. The JADS model outperforms the two-step baselines. With…
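The self-supervised data construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `build_mixed_example`, the parameter `k`, the random shuffling of sentences, and the concatenation of summaries in sampling order are all assumptions made for illustration. The abstract states only that sentences from multiple documents are mixed to form the input and that the corresponding article summaries serve as the labels.

```python
import random
from typing import List, Tuple

# An article is represented as (list of sentences, reference summary).
Article = Tuple[List[str], str]


def build_mixed_example(articles: List[Article], k: int = 2, seed: int = 0) -> Tuple[str, str]:
    """Build one (input, label) pair by mixing sentences from k articles.

    Hypothetical helper: the mixing strategy (uniform shuffle) and the
    label format (summaries joined by a space) are assumptions.
    """
    rng = random.Random(seed)
    chosen = rng.sample(articles, k)  # pick k distinct articles

    # Pool every sentence from the chosen articles, then shuffle them
    # so the model cannot rely on document boundaries.
    mixed = [sentence for sentences, _ in chosen for sentence in sentences]
    rng.shuffle(mixed)

    source = " ".join(mixed)
    # The label is the set of summaries of the articles in the mixture.
    target = " ".join(summary for _, summary in chosen)
    return source, target


# Toy data standing in for CNN/DailyMail-style (article, summary) pairs.
toy_articles = [
    (["The team won the final.", "Fans celebrated downtown."], "Team wins final."),
    (["Rates were raised again.", "Markets fell sharply."], "Rate hike hits markets."),
    (["A new telescope launched."], "Telescope launched."),
]
source, target = build_mixed_example(toy_articles, k=2, seed=0)
```

Each resulting pair would then be used to train a sequence-to-sequence model to recover the per-article summaries from the mixed input, which is the sense in which discovery and summarization happen in one step.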
Taxonomy
Topics: Web Data Mining and Analysis
