Towards Better Understanding with Uniformity and Explicit Regularization of Embeddings in Embedding-based Neural Topic Models

Wei Shao; Lei Huang; Shuqi Liu; Shihua Ma; Linqi Song

arXiv:2206.07960·cs.CL·June 17, 2022

TL;DR

This paper introduces an embedding regularization method with explicit constraints and uniformity evaluation to improve neural topic models, leading to better interpretability and performance.

Contribution

It proposes a novel regularization approach with uniformity metrics to analyze and enhance embedding-based neural topic models.

Findings

01

The proposed model significantly outperforms baselines in topic quality and document modeling.

02

Embedding uniformity correlates with training progress and model performance.

03

Ablation studies confirm the impact of embedding constraints on results.

Abstract

Embedding-based neural topic models can explicitly represent words and topics by embedding them into a homogeneous feature space, which gives them higher interpretability. However, the training of these embeddings has no explicit constraints, which leads to a larger optimization space. A clear description of how the embeddings change and how those changes affect model performance is also still lacking. In this paper, we propose an embedding-regularized neural topic model, which applies specially designed training constraints to the word embeddings and topic embeddings to reduce the optimization space of the parameters. To reveal the changes and roles of embeddings, we introduce uniformity into the embedding-based neural topic model as the evaluation metric of the embedding space. On this basis, we describe how embeddings tend to change during training via the changes in the uniformity of embeddings.…
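The uniformity idea in the abstract can be made concrete with a small sketch. The abstract shown here is truncated and gives no formula, so the code below assumes the definition commonly used in the contrastive-learning literature (Wang and Isola, 2020): the log of the average Gaussian potential over all pairs of L2-normalized embeddings. The function name `uniformity` and the default `t=2.0` are illustrative choices, not taken from the paper.

```python
import numpy as np


def uniformity(embeddings, t=2.0):
    """Log of the mean Gaussian potential over all distinct pairs of rows.

    Assumed definition (not confirmed by the abstract):
        log mean_{i<j} exp(-t * ||x_i - x_j||^2)
    with every row first projected onto the unit hypersphere.
    Rows must be non-zero, otherwise the normalization divides by zero.
    """
    x = np.asarray(embeddings, dtype=float)
    x = x / np.linalg.norm(x, axis=1, keepdims=True)  # L2-normalize each row

    # For unit vectors, ||u - v||^2 = 2 - 2 * (u . v).
    gram = x @ x.T
    sq_dist = np.clip(2.0 - 2.0 * gram, 0.0, None)  # clip tiny negative round-off

    # Keep each unordered pair once (strict upper triangle).
    rows, cols = np.triu_indices(len(x), k=1)
    return float(np.log(np.mean(np.exp(-t * sq_dist[rows, cols]))))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    collapsed = np.ones((50, 16))          # every embedding identical
    spread = rng.normal(size=(50, 16))     # roughly uniform directions
    print("collapsed:", uniformity(collapsed))
    print("spread:   ", uniformity(spread))
```

Under this assumed definition, lower (more negative) values mean the embeddings are spread more evenly over the hypersphere, while fully collapsed embeddings score 0. Tracking this value for word and topic embeddings across training epochs is one way to observe the kind of change the abstract describes.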

Taxonomy

Topics: Topic Modeling · Computational and Text Analysis Methods · Natural Language Processing Techniques