DiffETM: Diffusion Process Enhanced Embedded Topic Model
Wei Shao, Mingyang Liu, Linqi Song

TL;DR
DiffETM introduces a diffusion process into embedded topic modeling to better capture complex document-topic distributions, improving performance while maintaining ease of optimization.
Contribution
It proposes a novel diffusion-enhanced approach for embedded topic models, addressing the oversimplification of the logistic normal assumption.
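For context, the logistic normal assumption can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `sample_logistic_normal` and the parameters `mu` and `log_sigma` are hypothetical, and NumPy is assumed. A Gaussian draw is pushed through a softmax, so each document-topic vector comes from a single-mode Gaussian in the unconstrained space.

```python
import numpy as np

def sample_logistic_normal(mu, log_sigma, rng):
    """Draw one document-topic vector theta = softmax(mu + sigma * eps).

    mu, log_sigma: hypothetical per-document Gaussian parameters (shape [K]).
    """
    eps = rng.standard_normal(mu.shape)       # eps ~ N(0, I)
    delta = mu + np.exp(log_sigma) * eps      # reparameterized Gaussian sample
    delta = delta - delta.max()               # shift for numerical stability
    weights = np.exp(delta)
    return weights / weights.sum()            # softmax maps onto the simplex

rng = np.random.default_rng(0)
mu = np.zeros(5)          # K = 5 topics, chosen only for illustration
log_sigma = np.zeros(5)
theta = sample_logistic_normal(mu, log_sigma, rng)
```

Here `theta` is a valid topic proportion vector (non-negative, summing to one), but its distribution is restricted to what a softmax of one Gaussian can express.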
Findings
Improved topic modeling performance on two mainstream datasets
Effective in capturing complex document-topic distributions
Maintains ease of optimization
Abstract
The embedded topic model (ETM) is a widely used approach that assumes the sampled document-topic distribution follows a logistic normal distribution, which makes optimization easier. However, this assumption oversimplifies the real document-topic distribution and limits the model's performance. In response, we propose a novel method that introduces a diffusion process into the sampling of the document-topic distribution, overcoming this limitation while keeping optimization easy. We validate our method through extensive experiments on two mainstream datasets, demonstrating its effectiveness in improving topic modeling performance.
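The abstract does not say how the diffusion process is wired into sampling, so the following is only a generic sketch of a DDPM-style forward noising step applied to an unconstrained topic representation. The names `forward_diffuse` and `to_simplex`, the linear `betas` schedule, and the choice to diffuse before the softmax are all assumptions made for illustration, not details taken from DiffETM.

```python
import numpy as np

def forward_diffuse(x0, t, betas, rng):
    """Standard forward noising: x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * noise."""
    alpha_bar = np.cumprod(1.0 - betas)       # cumulative signal retention per step
    noise = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise
    return xt, noise

def to_simplex(x):
    """Softmax, mapping an unconstrained vector to topic proportions."""
    w = np.exp(x - x.max())
    return w / w.sum()

rng = np.random.default_rng(0)
betas = np.linspace(1e-4, 0.02, 100)          # assumed linear noise schedule
x0 = rng.standard_normal(5)                   # hypothetical clean representation, K = 5
xt, noise = forward_diffuse(x0, 50, betas, rng)
theta_t = to_simplex(xt)                      # noised sample, still a valid distribution
```

In a diffusion model, a learned network would be trained to undo this noising step by step, which is what allows the sampled distribution to be more flexible than a single Gaussian pushed through a softmax.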
Taxonomy
Topics: Machine Learning in Materials Science · Neural Networks and Applications
Methods: Diffusion
