Word Network Topic Model: A Simple but General Solution for Short and Imbalanced Texts
Yuan Zuo, Jichang Zhao, Ke Xu

TL;DR
The paper introduces WNTM, a simple, effective word network-based model for topic detection in short and imbalanced texts, outperforming traditional models like LDA.
Contribution
WNTM learns a distribution over topics for each word rather than for each document, which addresses sparsity and imbalance efficiently and improves detection of rare and emerging topics.
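To make the word-centric view concrete, here is a minimal sketch of how a word co-occurrence network and per-word pseudo-documents might be built from tokenized texts. It is an illustration under stated assumptions, not the authors' implementation: the function names `build_word_network` and `pseudo_documents`, the `window` parameter, and the simple pair-counting scheme are all hypothetical choices made for this example.

```python
from collections import Counter, defaultdict


def build_word_network(docs, window=5):
    """Count co-occurrences of distinct words that fall within `window` tokens.

    Assumption: each pair of token positions is counted once; the paper's
    exact windowing and weighting scheme may differ.
    """
    edges = Counter()
    for tokens in docs:
        for i, w in enumerate(tokens):
            # Look ahead at the next (window - 1) tokens only.
            for v in tokens[i + 1 : i + window]:
                if v != w:
                    # Undirected edge: store the pair in sorted order.
                    edges[tuple(sorted((w, v)))] += 1
    return edges


def pseudo_documents(edges):
    """Turn each word's weighted neighbor list into one pseudo-document."""
    pseudo = defaultdict(list)
    for (w, v), weight in edges.items():
        pseudo[w].extend([v] * weight)
        pseudo[v].extend([w] * weight)
    return dict(pseudo)


# Toy corpus (made up for illustration): two short texts sharing one word.
docs = [["apple", "iphone", "launch"], ["apple", "pie", "recipe"]]
edges = build_word_network(docs, window=3)
pseudo = pseudo_documents(edges)
```

Under this reading, the pseudo-document for a word pools context from every text the word appears in, so it is typically much longer than any single short text. The pseudo-documents could then be passed to a standard topic model such as LDA, whose per-document topic distribution becomes a per-word topic distribution.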
Findings
WNTM outperforms baseline methods on both short texts and normal-length texts.
WNTM effectively detects emerging topics early.
WNTM maintains low computational complexity.
Abstract
Short text has been the prevalent format for information on the Internet in recent decades, especially with the development of online social media, whose millions of users generate a vast number of short messages every day. Although the sophisticated signals delivered by short text make it a promising source for topic modeling, its extreme sparsity and imbalance bring unprecedented challenges to conventional topic models like LDA and its variants. Aiming at a simple but general solution for topic modeling in short texts, we present a word co-occurrence network based model named WNTM to tackle the sparsity and imbalance simultaneously. Different from previous approaches, WNTM models the distribution over topics for each word instead of learning topics for each document, which successfully enhances the semantic density of the data space without importing too much time or space…
Taxonomy
Topics: Topic Modeling · Advanced Text Analysis Techniques · Natural Language Processing Techniques
Methods: Latent Dirichlet Allocation
