Discriminative Topic Mining via Category-Name Guided Text Embedding

Yu Meng; Jiaxin Huang; Guangyuan Wang; Zihan Wang; Chao Zhang; Yu Zhang; Jiawei Han

arXiv:1908.07162·cs.CL·January 29, 2020

TL;DR

This paper introduces a new discriminative topic mining task guided by user-provided category names, proposing a novel embedding method called CatE that improves topic relevance and downstream classification performance.

Contribution

The paper presents CatE, a category-name guided text embedding approach for discriminative topic mining, leveraging minimal supervision to produce more relevant and useful topics.

Findings

1. CatE effectively mines high-quality, category-guided topics.
2. CatE improves downstream classification and lexical entailment tasks.
3. The method outperforms unsupervised topic models in relevance and utility.

Abstract

Mining a set of meaningful and distinctive topics automatically from massive text corpora has broad applications. Existing topic models, however, typically work in a purely unsupervised way and often generate topics that do not fit users' particular needs and yield suboptimal performance on downstream tasks. We propose a new task, discriminative topic mining, which leverages a set of user-provided category names to mine discriminative topics from text corpora. This new task not only helps a user understand clearly and distinctively the topics he/she is most interested in, but also directly benefits keyword-driven classification tasks. We develop CatE, a novel category-name guided text embedding method for discriminative topic mining, which effectively leverages minimal user guidance to learn a discriminative embedding space and discover category representative terms in an iterative…
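
To make the idea of category-name guided term retrieval concrete, here is a minimal sketch. It is not CatE itself: it assumes fixed, hand-made toy embeddings and performs a single assignment pass, with no embedding training and no iterative refinement. The function names `cosine` and `representative_terms` and the toy vectors are hypothetical, chosen only for illustration. Each term is assigned to the one category name it is closest to, which gives a rough sense of what "discriminative" means here.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def representative_terms(embeddings, category_names, k=2):
    """Assign each term to its single closest category name, then keep
    the k most similar terms per category.

    Illustrative only: the embeddings are assumed to be given, whereas
    the paper learns a discriminative embedding space.
    """
    topics = {name: [] for name in category_names}
    for term, vec in embeddings.items():
        if term in topics:
            continue  # skip the category names themselves
        scores = {name: cosine(vec, embeddings[name]) for name in category_names}
        best = max(scores, key=scores.get)
        topics[best].append((scores[best], term))
    return {
        name: [term for _, term in sorted(scored, reverse=True)[:k]]
        for name, scored in topics.items()
    }


# Hypothetical 2-d embeddings; real embeddings would be learned from a corpus.
toy_embeddings = {
    "sports": [1.0, 0.0],
    "politics": [0.0, 1.0],
    "soccer": [0.9, 0.1],
    "tennis": [0.8, 0.3],
    "senate": [0.1, 0.9],
    "election": [0.2, 0.8],
}

topics = representative_terms(toy_embeddings, ["sports", "politics"])
```

With these toy vectors, `topics` maps each category name to the terms nearest to it and to no other category. In practice the quality of such a ranking depends entirely on the embedding space, which is the part the paper's method is designed to learn.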

Code & Models

Repositories

yumeng5/CatE (Official)