MLego: Interactive and Scalable Topic Exploration Through Model Reuse
Fei Ye, Jiapan Liu, Yinan Jing, Zhenying He, Weirao Wang, X. Sean Wang

TL;DR
MLego is an interactive framework that enables real-time, scalable topic exploration by efficiently merging precomputed models, significantly reducing computation costs while maintaining high-quality results for large-scale text analysis.
Contribution
MLego introduces a novel approach for real-time topic modeling through model reuse, hierarchical plan search, and query reordering, enabling interactive exploration without retraining models.
Findings
MLego achieves a significant reduction in computation costs by merging materialized models instead of retraining from scratch.
MLego maintains high-quality topic modeling results.
MLego enables real-time, interactive exploration of large-scale datasets.
Abstract
With massive texts on social media, users and analysts often rely on topic modeling techniques to quickly extract key themes and gain insights. Traditional topic modeling techniques, such as Latent Dirichlet Allocation (LDA), provide valuable insights but are computationally expensive, making them impractical for real-time data analysis. Although recent advances in distributed training and fast sampling methods have improved efficiency, real-time topic exploration remains a significant challenge. In this paper, we present MLego, an interactive query framework designed to support real-time topic modeling analysis by leveraging model materialization and reuse. Instead of retraining models from scratch, MLego efficiently merges materialized topic models to construct approximate results at interactive speeds. To further enhance efficiency, we introduce a hierarchical plan search strategy…
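To make the idea of merging materialized topic models more concrete, the following is a minimal sketch under stated assumptions, not MLego's actual algorithm. It assumes each materialized model is stored as a list of topics, that each topic is a word-to-count dictionary, and that both models have the same number of topics. Topics are aligned greedily by cosine similarity and their counts are summed. The names `cosine`, `merge_models`, and `normalize` are hypothetical and chosen only for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse word-count dicts."""
    dot = sum(v * b.get(w, 0.0) for w, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def merge_models(model_a, model_b):
    """Merge two topic models (assumed format: list of word->count dicts).

    Each topic in model_b is matched to the most similar still-unmatched
    topic in model_a, and the word counts are added together. Greedy
    matching is an illustrative simplification, not the paper's method.
    """
    if len(model_a) != len(model_b):
        raise ValueError("this sketch assumes equal topic counts")
    merged = [dict(topic) for topic in model_a]  # copy so inputs stay intact
    unused = set(range(len(model_a)))
    for topic_b in model_b:
        # sorted() makes tie-breaking deterministic (lowest index wins)
        best = max(sorted(unused), key=lambda i: cosine(model_a[i], topic_b))
        unused.remove(best)
        for word, count in topic_b.items():
            merged[best][word] = merged[best].get(word, 0.0) + count
    return merged


def normalize(topic):
    """Turn word counts into a probability distribution over words."""
    total = sum(topic.values())
    return {word: count / total for word, count in topic.items()}


# Two toy "materialized" models trained on different data partitions.
model_a = [{"goal": 8, "match": 2}, {"vote": 6, "poll": 4}]
model_b = [{"vote": 3, "election": 1}, {"goal": 2, "team": 2}]
merged = merge_models(model_a, model_b)
```

In this toy case the voting topic of `model_b` is folded into the voting topic of `model_a`, and likewise for the sports topics, with no retraining pass over the underlying documents. A production system would need a more careful alignment step and a way to bound the approximation error, which this sketch does not attempt.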
Taxonomy
Topics: Data Visualization and Analytics · Multimodal Machine Learning Applications · Advanced Graph Neural Networks
