Topic Modeling with Wasserstein Autoencoders
Feng Nan, Ran Ding, Ramesh Nallapati, Bing Xiang

TL;DR
This paper introduces a neural topic model using Wasserstein autoencoders with a Dirichlet prior, demonstrating improved topic coherence and diversity over existing models through MMD-based distribution matching.
Contribution
The paper presents a novel neural topic modeling approach within the Wasserstein autoencoder framework, using MMD for distribution matching and adding randomness to the encoder output during training for better topic quality.
Findings
MMD outperforms GAN in high-dimensional Dirichlet distribution matching.
Incorporating randomness in the encoder output during training improves topic coherence.
The proposed topic uniqueness metric, used together with NPMI coherence, evaluates both topic diversity and topic quality.
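The distribution matching behind the first finding can be illustrated with a minimal sketch of an unbiased MMD estimate between a batch of document-topic vectors and samples from a Dirichlet prior. This is not the authors' implementation: the Gaussian RBF kernel is a stand-in (the summary does not say which kernel the paper uses), and the names `rbf_kernel`, `mmd_squared`, the bandwidth, the batch size, and the concentration values are all assumptions chosen for illustration.

```python
import numpy as np

def rbf_kernel(a, b, bandwidth=0.5):
    # Pairwise squared Euclidean distances between rows of a and rows of b.
    sq_dist = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point error.
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * bandwidth ** 2))

def mmd_squared(x, y, bandwidth=0.5):
    # Unbiased estimate of squared MMD: diagonal terms are excluded
    # from the within-sample averages.
    n, m = len(x), len(y)
    k_xx = rbf_kernel(x, x, bandwidth)
    k_yy = rbf_kernel(y, y, bandwidth)
    k_xy = rbf_kernel(x, y, bandwidth)
    term_xx = (k_xx.sum() - np.trace(k_xx)) / (n * (n - 1))
    term_yy = (k_yy.sum() - np.trace(k_yy)) / (m * (m - 1))
    return term_xx + term_yy - 2.0 * k_xy.mean()

rng = np.random.default_rng(0)
num_topics = 20  # hypothetical latent dimension

# Samples from a sparse Dirichlet prior on the topic simplex.
prior = rng.dirichlet(np.full(num_topics, 0.1), size=256)

# Stand-ins for encoder outputs: one batch that follows the prior,
# one near-uniform batch that does not.
matched = rng.dirichlet(np.full(num_topics, 0.1), size=256)
mismatched = rng.dirichlet(np.full(num_topics, 10.0), size=256)

mmd_matched = mmd_squared(matched, prior)
mmd_mismatched = mmd_squared(mismatched, prior)
print(mmd_matched, mmd_mismatched)
```

In a training loop, a term of this form would be added to the reconstruction loss so that the encoder's outputs are pushed toward the prior. The estimate for the matched batch should sit near zero, while the mismatched batch should give a clearly larger value.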
Abstract
We propose a novel neural topic model in the Wasserstein autoencoders (WAE) framework. Unlike existing variational autoencoder based models, we directly enforce a Dirichlet prior on the latent document-topic vectors. We exploit the structure of the latent space and apply a suitable kernel in minimizing the Maximum Mean Discrepancy (MMD) to perform distribution matching. We discover that MMD performs much better than the Generative Adversarial Network (GAN) in matching high-dimensional Dirichlet distributions. We further discover that incorporating randomness in the encoder output during training leads to significantly more coherent topics. To measure the diversity of the produced topics, we propose a simple topic uniqueness metric. Together with the widely used coherence measure NPMI, we offer a more holistic evaluation of topic quality. Experiments on several real datasets show that our…
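The abstract names a topic uniqueness metric without defining it. The following is a minimal sketch under an assumed definition: each top word of a topic contributes the inverse of the number of topics whose top-word lists contain it, and the score is averaged over words and then over topics. The function name `topic_uniqueness`, the `top_n` parameter, and the toy word lists are hypothetical and may differ from the paper's exact formulation.

```python
from collections import Counter

def topic_uniqueness(topics, top_n=10):
    """Average inverse-frequency score of each topic's top words.

    topics: list of word lists, each sorted by descending topic weight.
    Assumes words are not repeated within a single topic's list.
    """
    top_words = [topic[:top_n] for topic in topics]
    # How many times each word appears across all topics' top-word lists.
    counts = Counter(word for words in top_words for word in words)
    per_topic = [
        sum(1.0 / counts[word] for word in words) / len(words)
        for words in top_words
    ]
    return sum(per_topic) / len(per_topic)

# Toy examples with two topics and two top words each.
disjoint = topic_uniqueness([["game", "team"], ["stock", "market"]], top_n=2)
overlapping = topic_uniqueness([["game", "team"], ["game", "market"]], top_n=2)
identical = topic_uniqueness([["game", "team"], ["game", "team"]], top_n=2)
print(disjoint, overlapping, identical)
```

Under this assumed definition the score is 1 when no top word is shared between topics and falls toward 1 divided by the number of topics as the topics become redundant, which is why it complements a coherence measure such as NPMI.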
Taxonomy
Topics: Topic Modeling · Computational and Text Analysis Methods · Generative Adversarial Networks and Image Synthesis
