Sparsemax and Relaxed Wasserstein for Topic Sparsity
Tianyi Lin, Zhiyue Hu, Xin Guo

TL;DR
This paper introduces two neural topic models that use the Gaussian sparsemax construction and a relaxed Wasserstein divergence to capture topic sparsity in short texts and social media posts, improving accuracy and stability.
Contribution
The paper proposes neural topic models with sparse posterior distributions over topics, built on the Gaussian sparsemax construction and a relaxed Wasserstein divergence, and reports better stability and performance than existing methods.
Findings
The proposed models outperform probabilistic and neural baselines
They capture topic sparsity effectively in short texts
Results are demonstrated on large, diverse text corpora
Abstract
Topic sparsity refers to the observation that individual documents usually focus on several salient topics instead of covering a wide variety of topics, and a real topic adopts a narrow range of terms instead of a wide coverage of the vocabulary. Understanding this topic sparsity is especially important for analyzing user-generated web content and social media, which are featured in the form of extremely short posts and discussions. As topic sparsity of individual documents in online social media increases, so does the difficulty of analyzing the online text sources using traditional methods. In this paper, we propose two novel neural models by providing sparse posterior distributions over topics based on the Gaussian sparsemax construction, enabling efficient training by stochastic backpropagation. We construct an inference network conditioned on the input data and infer the…
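To make the "sparse posterior over topics" idea concrete, here is a minimal NumPy sketch of the standard sparsemax transformation (Euclidean projection onto the probability simplex) applied to a reparameterized Gaussian sample. It is an illustration under stated assumptions, not the authors' implementation: the function names, the diagonal-Gaussian parameterization via `mu` and `log_sigma`, and the use of NumPy are all hypothetical choices made for this example.

```python
import numpy as np

def sparsemax(z):
    """Project a score vector onto the probability simplex.

    Unlike softmax, the result can contain exact zeros.
    """
    z = np.asarray(z, dtype=float)
    z_sorted = np.sort(z)[::-1]              # scores in descending order
    cumsum = np.cumsum(z_sorted)
    k = np.arange(1, z.size + 1)
    support = 1 + k * z_sorted > cumsum      # true for a prefix of the sorted scores
    k_z = k[support][-1]                     # size of the support
    tau = (cumsum[k_z - 1] - 1) / k_z        # threshold subtracted from every score
    return np.maximum(z - tau, 0.0)

def gaussian_sparsemax_sample(mu, log_sigma, rng):
    """Hypothetical helper: draw z = mu + sigma * eps, then apply sparsemax.

    Assumes a diagonal Gaussian; the paper's inference network is not shown.
    """
    eps = rng.standard_normal(np.shape(mu))
    return sparsemax(np.asarray(mu) + np.exp(log_sigma) * eps)

# Usage: a dominant score takes all the mass, the other topics get exactly zero.
print(sparsemax([2.0, 1.0, 0.1]))
rng = np.random.default_rng(0)
print(gaussian_sparsemax_sample(np.zeros(5), np.zeros(5), rng))
```

Because the noise enters only through `eps`, the sample is a deterministic function of `mu` and `log_sigma`, which is what allows gradients to flow through it in stochastic backpropagation.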
Taxonomy
Topics: Topic Modeling · Text and Document Classification Technologies · Natural Language Processing Techniques
Methods: Softmax
