Keyword Assisted Topic Models
Shusei Eshima, Kosuke Imai, Tomoya Sasaki

TL;DR
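This paper introduces keyATM, a keyword-assisted topic model in which researchers supply a small number of keywords for each topic before fitting. The keywords make topics more interpretable and improve how well the model measures concepts of interest, addressing known limitations of fully unsupervised topic models.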
Contribution
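The paper presents keyATM, a keyword-assisted approach to topic modeling. Researchers label topics in advance by specifying keywords, rather than relying on document-level labels or post-hoc interpretation, which improves interpretability and performance over standard unsupervised topic models.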
Findings
keyATM yields more interpretable topics than standard topic models.
It improves document classification performance.
It is less sensitive to the chosen number of topics.
Abstract
In recent years, fully automated content analysis based on probabilistic topic models has become popular among social scientists because of their scalability. The unsupervised nature of the models makes them suitable for exploring topics in a corpus without prior knowledge. However, researchers find that these models often fail to measure specific concepts of substantive interest by inadvertently creating multiple topics with similar content and combining distinct themes into a single topic. In this paper, we empirically demonstrate that providing a small number of keywords can substantially enhance the measurement performance of topic models. An important advantage of the proposed keyword assisted topic model (keyATM) is that the specification of keywords requires researchers to label topics prior to fitting a model to the data. This contrasts with a widespread practice of post-hoc…
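To make the idea of labeling topics with keywords before fitting concrete, here is a minimal Python sketch. It is not the keyATM model or its API: it illustrates a simpler, related technique (a keyword-seeded Dirichlet prior over topic-word distributions), and the topic names, keyword lists, toy corpus, and the `base` and `boost` parameters are all hypothetical.

```python
# Illustrative only: a keyword-seeded prior, NOT the keyATM model itself.
# Topic labels and keywords are chosen before any model is fit.
keywords = {
    "economy": ["tax", "budget", "trade"],          # hypothetical keywords
    "health": ["hospital", "insurance", "vaccine"],  # hypothetical keywords
}

# Toy corpus invented for this example.
docs = [
    "the budget raises tax on trade".split(),
    "hospital insurance covers the vaccine".split(),
]


def build_seeded_prior(keywords, vocab, base=0.01, boost=1.0):
    """Return {topic: {word: pseudo-count}}.

    Every word gets a small base pseudo-count; a topic's own keywords
    get an extra boost, nudging that topic toward its intended label.
    """
    prior = {}
    for topic, words in keywords.items():
        seeds = set(words)
        prior[topic] = {
            word: base + (boost if word in seeds else 0.0) for word in vocab
        }
    return prior


vocab = sorted({word for doc in docs for word in doc})
prior = build_seeded_prior(keywords, vocab)

for topic, weights in prior.items():
    boosted = [word for word, weight in weights.items() if weight > 0.5]
    print(topic, "->", boosted)
```

A prior built this way could be passed to a topic model that accepts per-topic word pseudo-counts. The point of the sketch is only that each topic's label is fixed by its keywords in advance, instead of being assigned after inspecting fitted topics.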
Taxonomy
Topics: Computational and Text Analysis Methods · Topic Modeling · Advanced Text Analysis Techniques
