Hierarchical Latent Semantic Mapping for Automated Topic Generation
Guorui Zhou, Guang Chen

TL;DR
This paper introduces Hierarchical Latent Semantic Mapping (HLSM), a novel method for automatic topic generation from large text corpora that leverages network community detection algorithms to overcome the need for predefining the number of topics.
Contribution
HLSM is a new approach that constructs a word association network and detects topics in it hierarchically, removing the need to specify the number of topics manually, as traditional models require.
Findings
HLSM outperforms several state-of-the-art topic modeling methods.
HLSM automatically determines the number of topics.
Experimental results show promising performance on multiple datasets.
Abstract
A great deal of information resides in an unprecedented volume of text data. Managing the allocation of such large-scale text data is an important problem in many areas, and topic modeling performs well on it. The traditional generative models (PLSA, LDA) are the state-of-the-art approaches in topic modeling, and most recent research on topic generation has focused on improving or extending these models. However, the results of traditional generative models are sensitive to the number of topics K, which must be specified manually. The problem of generating topics from a corpus resembles community detection in networks. Many effective algorithms can automatically detect communities in networks without a manually specified number of communities. Inspired by these algorithms, in this paper we propose a novel method named Hierarchical Latent Semantic Mapping (HLSM), which automatically…
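The analogy in the abstract, topics as communities in a word network, can be illustrated with a small sketch. This is not the authors' HLSM algorithm: the abstract does not specify how the network is built or which community detection method is used, so the sketch assumes document-level word co-occurrence as the edge weight and swaps in a simple deterministic label propagation. It is also flat rather than hierarchical. The function names `build_cooccurrence_graph` and `label_propagation` and the toy documents are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_cooccurrence_graph(docs):
    """Return {word: {neighbor: weight}}; weight counts documents shared by both words.

    Assumption: document-level co-occurrence stands in for word association.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for doc in docs:
        words = sorted(set(doc.lower().split()))
        for a, b in combinations(words, 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph


def label_propagation(graph, max_rounds=20):
    """Group words by weighted label propagation (illustrative stand-in, not HLSM).

    The number of groups is never passed in; it emerges from the graph.
    """
    labels = {word: word for word in graph}  # every word starts in its own group
    for _ in range(max_rounds):
        changed = False
        for word in sorted(graph):  # fixed visiting order keeps the run deterministic
            votes = Counter()
            for neighbor, weight in graph[word].items():
                votes[labels[neighbor]] += weight
            # Highest total weight wins; ties go to the alphabetically first label.
            best = min(votes, key=lambda lab: (-votes[lab], lab))
            if best != labels[word]:
                labels[word] = best
                changed = True
        if not changed:
            break
    groups = defaultdict(set)
    for word, lab in labels.items():
        groups[lab].add(word)
    return sorted(groups.values(), key=sorted)


# Toy corpus (hypothetical): two themes with no shared vocabulary.
docs = [
    "cat dog pet",
    "dog pet vet",
    "stock market trade",
    "market trade price",
]
topics = label_propagation(build_cooccurrence_graph(docs))
```

On this toy corpus the words fall into two groups without any topic count being supplied, which is the property the abstract contrasts with choosing K by hand in PLSA or LDA. Real corpora would need stop-word filtering and a sturdier detection method.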
Taxonomy
Topics: Topic Modeling · Advanced Text Analysis Techniques · Computational and Text Analysis Methods
