Evaluating Topic Quality with Posterior Variability
Linzi Xing, Michael J. Paul, Giuseppe Carenini

TL;DR
This paper introduces a metric for evaluating the quality of topics learned by latent Dirichlet allocation (LDA). The metric is based on the variability of the topics' posterior distributions and correlates well with human judgments of topic quality.
Contribution
The paper proposes a novel metric for topic quality assessment based on posterior variability, and shows in experiments on three corpora that it correlates with human judgments better than several existing baselines for automatic topic evaluation.
Findings
The new metric achieves state-of-the-art correlation with human judgments.
Combining multiple metrics with supervised learning further improves topic quality estimation.
The approach is validated on three different corpora.
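As a rough illustration of what "correlation with human judgments" means in practice, the sketch below scores a handful of topics with an automatic metric and compares those scores against human ratings. The choice of Pearson correlation, the function name `pearson`, and the toy numbers are all assumptions made for illustration; they are not taken from the paper or its results.

```python
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Toy values only (hypothetical): one automatic score and one
# human rating per topic.
metric_scores = [0.9, 0.4, 0.7, 0.1]
human_ratings = [3.0, 2.0, 2.5, 1.0]

r = pearson(metric_scores, human_ratings)
print(f"correlation with human ratings: {r:.3f}")
```

A metric whose scores rise and fall with the human ratings yields a correlation near 1, which is the sense in which one automatic metric can be said to track human judgments better than another.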
Abstract
Probabilistic topic models such as latent Dirichlet allocation (LDA) are popularly used with Bayesian inference methods such as Gibbs sampling to learn posterior distributions over topic model parameters. We derive a novel measure of LDA topic quality using the variability of the posterior distributions. Compared to several existing baselines for automatic topic evaluation, the proposed metric achieves state-of-the-art correlations with human judgments of topic quality in experiments on three corpora. We additionally demonstrate that topic quality estimation can be further improved using a supervised estimator that combines multiple metrics.
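To make the idea of posterior variability concrete, here is a minimal sketch of one way to summarize how much a topic's per-document proportion fluctuates across Gibbs samples. The data layout `theta_samples[s][d][k]`, the function name `topic_variability`, and the use of a summed coefficient of variation are assumptions for illustration; the abstract does not specify the exact formula, so this should not be read as the paper's definition.

```python
from statistics import mean, pstdev

def topic_variability(theta_samples, topic):
    """Sum, over documents, of the coefficient of variation of one topic's
    proportion across posterior samples.

    theta_samples[s][d][k] (hypothetical layout) is the proportion of
    topic k in document d according to posterior sample s.
    """
    n_docs = len(theta_samples[0])
    total = 0.0
    for d in range(n_docs):
        values = [sample[d][topic] for sample in theta_samples]
        mu = mean(values)
        if mu > 0:  # skip documents where the topic never appears
            total += pstdev(values) / mu
    return total

# Two toy posterior samples over two documents and two topics.
theta_samples = [
    [[0.5, 0.5], [0.2, 0.8]],
    [[0.5, 0.5], [0.4, 0.6]],
]
scores = [topic_variability(theta_samples, k) for k in range(2)]
```

Each topic receives one score from the same set of samples, so the scores can be compared across topics and then checked against human ratings.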
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Computational and Text Analysis Methods
Methods: Latent Dirichlet Allocation
