Scalable and Robust Construction of Topical Hierarchies
Chi Wang, Xueqing Liu, Yanglei Song, Jiawei Han

TL;DR
This paper introduces a scalable, robust algorithm for automatically generating high-quality topical hierarchies from large text collections, enabling efficient and interactive knowledge organization.
Contribution
It presents a novel top-down recursive framework using tensor orthogonal decomposition for hierarchical topic modeling, significantly improving scalability and robustness.
Findings
Reduces hierarchy construction time by several orders of magnitude
Generates high-quality, robust topical hierarchies
Enables interactive hierarchy revision
Abstract
Automated generation of high-quality topical hierarchies for a text collection is a dream problem in knowledge engineering with many valuable applications. In this paper, we propose a scalable and robust algorithm for constructing a hierarchy of topics from a text collection. We divide and conquer the problem using a top-down recursive framework based on a tensor orthogonal decomposition technique. We address the critical challenge of performing scalable inference for our newly designed hierarchical topic model. Experiments on various real-world datasets illustrate its ability to generate robust, high-quality hierarchies efficiently. Our method reduces construction time by several orders of magnitude, and its robustness makes it possible for users to interactively revise the hierarchy.
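The abstract names tensor orthogonal decomposition as the core inference tool but does not describe it. As a generic illustration only, and not the paper's algorithm, the sketch below applies the standard tensor power method with deflation to a small synthetic symmetric third-order tensor with known orthonormal components. The function name `tensor_power_method`, its parameters, and the toy data are all assumptions introduced here; the moment construction and whitening steps that a topic model would need before this stage are omitted.

```python
import numpy as np

def tensor_power_method(T, n_components, n_restarts=10, n_iters=100, seed=0):
    """Hypothetical helper: recover (weight, vector) pairs of a symmetric,
    orthogonally decomposable 3rd-order tensor by power iteration + deflation."""
    rng = np.random.default_rng(seed)
    T = T.copy()
    d = T.shape[0]
    weights, vectors = [], []
    for _ in range(n_components):
        best_val, best_vec = -np.inf, None
        for _ in range(n_restarts):
            # Random unit-norm starting point.
            v = rng.normal(size=d)
            v /= np.linalg.norm(v)
            for _ in range(n_iters):
                # Power update: v <- T(I, v, v) / ||T(I, v, v)||
                v = np.einsum("ijk,j,k->i", T, v, v)
                v /= np.linalg.norm(v)
            # Generalized Rayleigh quotient T(v, v, v) estimates the weight.
            val = np.einsum("ijk,i,j,k->", T, v, v, v)
            if val > best_val:
                best_val, best_vec = val, v
        weights.append(best_val)
        vectors.append(best_vec)
        # Deflation: subtract the recovered rank-one term before the next round.
        T -= best_val * np.einsum("i,j,k->ijk", best_vec, best_vec, best_vec)
    return np.array(weights), np.array(vectors)

# Toy data (assumed for illustration): three orthonormal vectors in R^5
# with positive weights, combined into T = sum_r w_r * v_r (x) v_r (x) v_r.
rng = np.random.default_rng(42)
Q, _ = np.linalg.qr(rng.normal(size=(5, 5)))
true_vectors = Q[:, :3].T
true_weights = np.array([3.0, 2.0, 1.0])
T = np.einsum("r,ri,rj,rk->ijk", true_weights,
              true_vectors, true_vectors, true_vectors)

weights, vectors = tensor_power_method(T, n_components=3)
```

On this noiseless toy tensor, each recovered row of `vectors` should align with one of the rows of `true_vectors`, and `weights` should match `true_weights` up to ordering. Real text data yields noisy moment estimates, so restarts and iteration counts would need tuning.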
Taxonomy
Topics: Data Management and Algorithms · Video Analysis and Summarization · Neural Networks and Applications
