Simple Yet Effective Synthetic Dataset Construction for Unsupervised Opinion Summarization
Ming Shen, Jie Ma, Shuai Wang, Yogarshi Vyas, Kalpit Dixit, Miguel Ballesteros, Yassine Benajiba

TL;DR
This paper introduces two simple unsupervised methods for opinion summarization that leverage synthetic datasets, significantly improving aspect-specific summary quality without requiring annotated data.
Contribution
It proposes two unsupervised approaches, SW-LOO and NLI-LOO, that generate opinion summaries by training on synthetic datasets built from aspect-related review content, and reports gains over existing methods.
Findings
SW-LOO outperforms existing methods by 3.4 ROUGE-L points on SPACE for aspect-specific opinion summarization.
NLI-LOO outperforms existing approaches by 1.2 ROUGE-L points on SPACE.
Both methods effectively generate aspect-specific summaries without annotated data.
Abstract
Opinion summarization provides an important solution for summarizing opinions expressed among a large number of reviews. However, generating aspect-specific and general summaries is challenging due to the lack of annotated data. In this work, we propose two simple yet effective unsupervised approaches to generate both aspect-specific and general opinion summaries by training on synthetic datasets constructed with aspect-related review contents. Our first approach, Seed Words Based Leave-One-Out (SW-LOO), identifies aspect-related portions of reviews simply by exact-matching aspect seed words and outperforms existing methods by 3.4 ROUGE-L points on SPACE and 0.5 ROUGE-1 point on OPOSUM+ for aspect-specific opinion summarization. Our second approach, Natural Language Inference Based Leave-One-Out (NLI-LOO) identifies aspect-related sentences utilizing an NLI model in a more general…
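The seed-word matching and leave-one-out idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The sentence splitter, the tokenizer, the function names, and the choice to pair one review's aspect-related sentences (as a pseudo-summary) with the aspect-related sentences of the remaining reviews (as input) are all assumptions made for illustration. The abstract only states that aspect-related portions are found by exact-matching aspect seed words.

```python
import re


def split_sentences(review):
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", review) if s.strip()]


def aspect_sentences(review, seed_words):
    # Keep sentences containing at least one seed word as an exact token match.
    seeds = {w.lower() for w in seed_words}
    hits = []
    for sent in split_sentences(review):
        tokens = set(re.findall(r"[a-z0-9']+", sent.lower()))
        if tokens & seeds:
            hits.append(sent)
    return hits


def leave_one_out_pairs(reviews, seed_words):
    # Hypothetical construction: each review with aspect-related content is held
    # out once as the pseudo-summary; the other reviews' aspect-related
    # portions form the input.
    portions = [aspect_sentences(r, seed_words) for r in reviews]
    pairs = []
    for i, target in enumerate(portions):
        if not target:
            continue  # nothing aspect-related to use as a pseudo-summary
        sources = [" ".join(p) for j, p in enumerate(portions) if j != i and p]
        if sources:
            pairs.append({"input": sources, "summary": " ".join(target)})
    return pairs
```

Calling `leave_one_out_pairs(reviews, ["room", "bed"])` on the reviews of a single hotel would yield one training pair per review that mentions a seed word. The seed list here is hypothetical. Because matching is exact at the token level, "rooms" does not match the seed "room".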
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques
