TL;DR
This paper introduces discriminative topic mining, a task guided by user-provided category names, and proposes CatE, an embedding method that improves topic relevance and downstream classification performance.
Contribution
The paper presents CatE, a category-name guided text embedding approach for discriminative topic mining, leveraging minimal supervision to produce more relevant and useful topics.
Findings
CatE effectively mines high-quality, category-guided topics.
CatE improves downstream classification and lexical entailment tasks.
The method outperforms unsupervised topic models in relevance and utility.
Abstract
Mining a set of meaningful and distinctive topics automatically from massive text corpora has broad applications. Existing topic models, however, typically work in a purely unsupervised way and often generate topics that do not fit users' particular needs and yield suboptimal performance on downstream tasks. We propose a new task, discriminative topic mining, which leverages a set of user-provided category names to mine discriminative topics from text corpora. This new task not only helps users understand clearly and distinctively the topics they are most interested in, but also directly benefits keyword-driven classification tasks. We develop CatE, a novel category-name guided text embedding method for discriminative topic mining, which effectively leverages minimal user guidance to learn a discriminative embedding space and discover category representative terms in an iterative…
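The abstract's idea of starting from category names alone and iteratively collecting representative terms can be illustrated with a toy sketch. This is not CatE's training procedure, which the excerpt above does not spell out. The sketch assumes word vectors already exist and substitutes a plain nearest-centroid rule, and the names `mine_topic_terms`, `toy_embeddings`, and the 2-D vectors are hypothetical, made up for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def mine_topic_terms(embeddings, category_names, terms_per_category=3):
    """Toy stand-in (an assumption, not CatE): grow each topic one term per round."""
    # Each topic starts from its category name, the only supervision given.
    topics = {name: [name] for name in category_names}
    assigned = set(category_names)
    for _ in range(terms_per_category - 1):
        # Re-estimate each category vector from its current representative terms.
        centers = {
            c: centroid([embeddings[w] for w in terms])
            for c, terms in topics.items()
        }
        for c in category_names:
            best_word, best_score = None, -2.0
            for word, vec in embeddings.items():
                if word in assigned:
                    continue
                sims = {k: cosine(vec, center) for k, center in centers.items()}
                # Discriminative filter: keep the word only if this category
                # is the one it is most similar to.
                if max(sims, key=sims.get) != c:
                    continue
                if sims[c] > best_score:
                    best_word, best_score = word, sims[c]
            if best_word is not None:
                topics[c].append(best_word)
                assigned.add(best_word)
    return topics


# Hypothetical 2-D embeddings, hand-picked for illustration only.
toy_embeddings = {
    "sports": [1.0, 0.0],
    "politics": [0.0, 1.0],
    "soccer": [0.9, 0.1],
    "tennis": [0.8, 0.3],
    "election": [0.1, 0.9],
    "senate": [0.3, 0.8],
    "news": [0.7, 0.7],
}

topics = mine_topic_terms(toy_embeddings, ["sports", "politics"])
print(topics)
```

On this toy data the generic word "news" is close to both categories and ends up in neither topic, which is the kind of term a discriminative setting is meant to keep out.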
