Evaluating Topic Quality with Posterior Variability

Linzi Xing; Michael J. Paul; Giuseppe Carenini

arXiv:1909.03524·cs.CL·September 17, 2019

TL;DR

This paper introduces a new metric for evaluating the quality of topics generated by LDA, based on the variability of their posterior distributions, which correlates well with human judgments.

Contribution

The paper proposes a novel posterior variability-based metric for topic quality assessment and demonstrates its superiority over existing methods through empirical evaluation.

Findings

01 The new metric achieves state-of-the-art correlation with human judgments.

02 Combining multiple metrics with supervised learning further improves topic quality estimation.

03 The approach is validated on three different corpora.

Abstract

Probabilistic topic models such as latent Dirichlet allocation (LDA) are popularly used with Bayesian inference methods such as Gibbs sampling to learn posterior distributions over topic model parameters. We derive a novel measure of LDA topic quality using the variability of the posterior distributions. Compared to several existing baselines for automatic topic evaluation, the proposed metric achieves state-of-the-art correlations with human judgments of topic quality in experiments on three corpora. We additionally demonstrate that topic quality estimation can be further improved using a supervised estimator that combines multiple metrics.
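As a rough illustration of what "variability of the posterior distributions" can mean in practice, the sketch below scores each topic by how much its per-document proportion fluctuates across Gibbs samples. This is a minimal sketch under stated assumptions, not the paper's exact definition: the function name `topic_variability`, the use of the coefficient of variation, the averaging over documents, and the synthetic samples are all hypothetical choices made for the example.

```python
import numpy as np

def topic_variability(theta_samples, eps=1e-12):
    """Per-topic variability score (illustrative assumption, not the paper's formula).

    theta_samples: array of shape (S, D, K) holding S posterior samples of the
    document-topic proportions for D documents and K topics.
    Returns an array of shape (K,): the coefficient of variation (std / mean)
    of each topic's proportion across samples, averaged over documents.
    """
    theta = np.asarray(theta_samples, dtype=float)
    mean = theta.mean(axis=0)      # (D, K) mean over posterior samples
    std = theta.std(axis=0)        # (D, K) spread over posterior samples
    cv = std / (mean + eps)        # eps guards against division by zero
    return cv.mean(axis=0)         # average over documents -> (K,)

# Synthetic stand-in for Gibbs samples (hypothetical data, 3 topics).
rng = np.random.default_rng(0)
S, D = 50, 20
t0 = 0.4 + 0.005 * rng.standard_normal((S, D))                    # stable topic
t1 = np.clip(0.3 + 0.1 * rng.standard_normal((S, D)), 0.01, 0.55)  # fluctuating topic
t2 = 1.0 - t0 - t1                                                 # remainder, so rows sum to 1
theta_samples = np.stack([t0, t1, t2], axis=-1)                    # shape (S, D, K)

scores = topic_variability(theta_samples)
print(scores)
```

The stable topic receives a much smaller score than the fluctuating one. How such a score relates to human judgments of topic quality is the question the paper's experiments address.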

Code & Models

Repositories

lxing532/topic_variability (official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Computational and Text Analysis Methods

Methods: Latent Dirichlet Allocation