Fast Online EM for Big Topic Modeling
Jia Zeng, Zhi-Qiang Liu, Xiao-Qin Cao

TL;DR
This paper introduces FOEM, a fast online EM algorithm for big topic modeling that incrementally infers topic distributions from large data streams with constant memory requirements, which makes it suitable for lifelong learning scenarios.
Contribution
The paper presents FOEM, an online EM algorithm that converges to a local stationary point of the LDA likelihood function with constant memory requirements and is more efficient than existing online LDA methods on some lifelong topic modeling tasks.
Findings
FOEM converges to a local stationary point of the LDA likelihood function.
FOEM is more efficient than state-of-the-art online LDA algorithms on some lifelong topic modeling tasks.
FOEM handles big data and models on a standard PC.
Abstract
The expectation-maximization (EM) algorithm can compute the maximum-likelihood (ML) or maximum a posteriori (MAP) point estimate of mixture models or latent variable models such as latent Dirichlet allocation (LDA), which has been one of the most popular probabilistic topic modeling methods in the past decade. However, batch EM incurs high time and space complexity when learning big LDA models from big data streams. In this paper, we present a fast online EM (FOEM) algorithm that infers the topic distribution from previously unseen documents incrementally with constant memory requirements. Within the stochastic approximation framework, we show that FOEM can converge to a local stationary point of LDA's likelihood function. By combining dynamic scheduling for speed with parameter streaming for low memory usage, FOEM is more efficient for some lifelong topic modeling tasks than…
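The general idea of online EM under stochastic approximation can be illustrated with a minimal sketch. This is not the authors' FOEM implementation: it omits dynamic scheduling and parameter streaming, and the function name `online_em_lda`, the additive pseudo-counts `alpha` and `beta`, and the step-size schedule `(tau0 + t) ** (-kappa)` are all assumptions chosen for illustration. It only shows how global topic-word statistics might be blended with each new minibatch so that the global state does not grow with the number of documents seen.

```python
import numpy as np

def online_em_lda(minibatches, vocab_size, num_topics, alpha=0.1, beta=0.01,
                  inner_iters=20, kappa=0.7, tau0=1.0, seed=0):
    """Stepwise (online) EM sketch for an LDA-style topic model.

    minibatches: iterable of dense count matrices, shape (docs, vocab_size).
    Returns topic-word probabilities, shape (num_topics, vocab_size).
    All hyperparameters here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    # Global sufficient statistics: expected topic-word counts.
    phi_stats = rng.random((num_topics, vocab_size)) + 1.0
    for t, counts in enumerate(minibatches):
        counts = np.asarray(counts, dtype=float)
        # Current topic-word probabilities, smoothed by pseudo-count beta.
        phi = phi_stats + beta
        phi /= phi.sum(axis=1, keepdims=True)
        # Local document-topic statistics, re-estimated for each minibatch.
        theta_stats = rng.random((counts.shape[0], num_topics))
        for _ in range(inner_iters):
            theta = theta_stats + alpha
            theta /= theta.sum(axis=1, keepdims=True)
            # E-step: responsibilities mu[d, k, w] proportional to
            # theta[d, k] * phi[k, w], normalized over topics k.
            mu = theta[:, :, None] * phi[None, :, :]
            mu /= mu.sum(axis=1, keepdims=True)
            # Local M-step: expected topic counts per document.
            theta_stats = (mu * counts[:, None, :]).sum(axis=2)
        # Expected topic-word counts contributed by this minibatch.
        batch_stats = (mu * counts[:, None, :]).sum(axis=0)
        # Stochastic-approximation update with a decaying step size.
        rho = (tau0 + t) ** (-kappa)
        phi_stats = (1.0 - rho) * phi_stats + rho * batch_stats
    phi = phi_stats + beta
    return phi / phi.sum(axis=1, keepdims=True)

# Usage on a tiny synthetic stream of three minibatches.
rng = np.random.default_rng(1)
batches = [rng.integers(0, 5, size=(4, 6)) for _ in range(3)]
topics = online_em_lda(batches, vocab_size=6, num_topics=2)
```

In this sketch only `phi_stats` persists between minibatches, so the retained state has a fixed size of `num_topics * vocab_size` regardless of stream length. A decay exponent `kappa` in (0.5, 1] is the usual choice for the step sizes to satisfy the standard stochastic-approximation conditions.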
