Improving the TENOR of Labeling: Re-evaluating Topic Models for Content Analysis
Zongxia Li, Andrew Mao, Daniel Stephens, Pranav Goel, Emily Walpole, Alden Dima, Juan Fung, Jordan Boyd-Graber

TL;DR
This paper evaluates neural, supervised, and classical topic models in an interactive content analysis setting, revealing that automated metrics like coherence are insufficient and that neural models can outperform classical ones in practical tasks.
Contribution
It provides the first comprehensive evaluation of different topic models in a human-centered, interactive context, highlighting the limitations of automated metrics and the practical advantages of neural models.
Findings
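The Contextual Neural Topic Model performs best on cluster evaluation metrics and human evaluations.
LDA remains competitive with the two other neural topic models in the simulated experiments and the user study, contrary to what coherence scores suggest.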
Automated coherence scores do not fully capture model effectiveness in real-world tasks.
Abstract
Topic models are a popular tool for understanding text collections, but their evaluation has been a point of contention. Automated evaluation metrics such as coherence are often used; however, their validity has been questioned for neural topic models (NTMs), and they can overlook a model's benefits in real-world applications. To this end, we conduct the first evaluation of neural, supervised, and classical topic models in an interactive, task-based setting. We combine topic models with a classifier and test their ability to help humans conduct content analysis and document annotation. In simulated, real-user, and expert pilot studies, the Contextual Neural Topic Model does the best on cluster evaluation metrics and human evaluations; however, LDA is competitive with two other NTMs under our simulated experiment and user study results, contrary to what coherence scores suggest. We show that…
Taxonomy
Topics: Computational and Text Analysis Methods
Methods: Latent Dirichlet Allocation
