Towards Big Topic Modeling
Jian-Feng Yan, Jia Zeng, Zhi-Qiang Liu, Yang Gao

TL;DR
This paper introduces a novel parallel topic modeling architecture based on power law to significantly reduce communication costs in big LDA tasks, combined with an online belief propagation algorithm for improved scalability and efficiency.
Contribution
It proposes a communication-efficient parallel architecture for big topic modeling based on power law, enhancing scalability and reducing communication overhead in multi-processor environments.
Findings
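POBP, the proposed parallel architecture combined with online belief propagation (OBP), achieves high accuracy in big topic modeling.
It consumes orders of magnitude less communication time than existing parallel LDA methods when the number of topics is large.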
It maintains constant memory usage regardless of data size.
Abstract
To solve the big topic modeling problem, we need to reduce both the time and space complexities of batch latent Dirichlet allocation (LDA) algorithms. Although parallel LDA algorithms on the multi-processor architecture have low time and space complexities, their communication costs among processors often scale linearly with the vocabulary size and the number of topics, leading to a serious scalability problem. To reduce the communication complexity among processors for better scalability, we propose a novel communication-efficient parallel topic modeling architecture based on power law, which consumes orders of magnitude less communication time when the number of topics is large. We combine the proposed communication-efficient parallel architecture with the online belief propagation (OBP) algorithm, and refer to the combination as POBP, for big topic modeling tasks. Extensive empirical results confirm that…
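The scaling argument in the abstract can be illustrated with a small back-of-the-envelope sketch. It is not the paper's algorithm: the abstract does not say how the power law is exploited, so the sketch simply assumes that word frequencies follow a Zipf distribution and that processors synchronize only the most frequent words. The function names, the Zipf exponent, the 80% coverage threshold, and the vocabulary and topic counts are all hypothetical values chosen for illustration.

```python
# Illustrative sketch only; every name and number here is an assumption,
# not something taken from the paper.

def zipf_weights(vocab_size, exponent=1.0):
    """Normalized Zipf frequencies for word ranks 1..vocab_size."""
    raw = [1.0 / (rank ** exponent) for rank in range(1, vocab_size + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def head_size(weights, coverage):
    """Smallest number of top-ranked words whose total mass reaches `coverage`."""
    mass = 0.0
    for count, w in enumerate(weights, start=1):
        mass += w
        if mass >= coverage:
            return count
    return len(weights)


def sync_cost(num_topics, num_words):
    """Matrix entries exchanged when syncing a num_topics x num_words block."""
    return num_topics * num_words


vocab_size, num_topics = 50_000, 1_000  # hypothetical problem size
weights = zipf_weights(vocab_size)
head = head_size(weights, coverage=0.8)  # assumed coverage threshold

full = sync_cost(num_topics, vocab_size)  # linear in both vocabulary and topics
partial = sync_cost(num_topics, head)     # only the frequent "head" words

print(f"full sync: {full} entries; head-only sync: {partial} entries")
```

Under these assumptions, a full synchronization grows with the product of vocabulary size and topic count, while the head-only variant grows with the much smaller number of frequent words.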
Taxonomy
Topics: Topic Modeling · Recommender Systems and Techniques · Text and Document Classification Technologies
Methods: Latent Dirichlet Allocation
