Towards Better Understanding with Uniformity and Explicit Regularization of Embeddings in Embedding-based Neural Topic Models
Wei Shao, Lei Huang, Shuqi Liu, Shihua Ma, Linqi Song

TL;DR
This paper introduces an embedding regularization method with explicit constraints and uniformity evaluation to improve neural topic models, leading to better interpretability and performance.
Contribution
It proposes a novel regularization approach with uniformity metrics to analyze and enhance embedding-based neural topic models.
Findings
The proposed model significantly outperforms baselines in topic quality and document modeling.
Embedding uniformity correlates with training progress and model performance.
Ablation studies confirm the impact of embedding constraints on results.
Abstract
Embedding-based neural topic models can explicitly represent words and topics by embedding them in a homogeneous feature space, which makes them more interpretable. However, the training of these embeddings has no explicit constraints, which leads to a larger optimization space. A clear description of how the embeddings change and how those changes affect model performance is also still lacking. In this paper, we propose an embedding regularized neural topic model, which applies specially designed training constraints to the word embeddings and topic embeddings to reduce the optimization space of the parameters. To reveal the changes and roles of the embeddings, we introduce uniformity into the embedding-based neural topic model as an evaluation metric of the embedding space. On this basis, we describe how embeddings tend to change during training via the changes in the uniformity of embeddings.…
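The abstract does not spell out how uniformity is computed. As a rough illustration only, the sketch below assumes the definition commonly used in representation learning: the logarithm of the average pairwise Gaussian potential between L2-normalized embeddings. The function name `uniformity` and the temperature `t=2.0` are assumptions for this example, not details taken from the paper.

```python
import numpy as np


def uniformity(embeddings, t=2.0):
    """Log of the mean Gaussian potential over all distinct pairs of rows.

    Assumed definition (not confirmed by the abstract): rows are projected
    onto the unit hypersphere first; lower values mean the embeddings are
    spread more evenly over the sphere.
    """
    x = np.asarray(embeddings, dtype=float)
    # Project every embedding onto the unit hypersphere.
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    # Squared Euclidean distance between every pair of rows.
    sq_norms = np.sum(x * x, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (x @ x.T)
    # Keep each unordered pair once and clip tiny negative rounding errors.
    i, j = np.triu_indices(len(x), k=1)
    pair_sq = np.clip(sq_dists[i, j], 0.0, None)
    return float(np.log(np.mean(np.exp(-t * pair_sq))))
```

Under this assumed definition, tracking `uniformity(word_embeddings)` and `uniformity(topic_embeddings)` at each epoch (both names are hypothetical) would give the kind of training-time curve the abstract refers to.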
Taxonomy
Topics: Topic Modeling · Computational and Text Analysis Methods · Natural Language Processing Techniques
