Is Automated Topic Model Evaluation Broken?: The Incoherence of Coherence

Alexander Hoyle; Pranav Goel; Denis Peskov; Andrew Hian-Cheong; Jordan Boyd-Graber; Philip Resnik

arXiv:2107.02173 · cs.CL · October 29, 2021 · 23 cites

TL;DR

This paper questions the validity of automated topic coherence metrics by comparing them with human judgments and analyzing their consistency across classical and neural models.

Contribution

It highlights the validation gap in automated coherence measures for neural models and systematically evaluates models to reveal discrepancies with human assessments.

Findings

01. Automated coherence often disagrees with human judgments.

02. Neural models outperform classical models on automated metrics but not necessarily on human evaluations.

03. There is a significant standardization gap in topic model benchmarking.
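One common way to quantify how well an automated metric agrees with human judgments is a rank correlation between per-topic metric scores and per-topic human ratings. The sketch below computes Spearman's rank correlation in plain Python. It is an illustration only: the function names `average_ranks` and `spearman` are hypothetical, and nothing here is taken from the paper's own analysis code or confirms which statistic the authors used.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j over the run of entries tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman's rho: the Pearson correlation of the two rank vectors.

    Assumes len(xs) == len(ys) and that neither input is constant
    (a constant input has zero rank variance and would divide by zero).
    """
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = sum((a - mean_x) ** 2 for a in rx) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in ry) ** 0.5
    return cov / (sd_x * sd_y)
```

Calling `spearman(automated_scores, human_ratings)` on two equal-length lists, one entry per topic, gives a value in [-1, 1]. Values near 1 mean the metric ranks topics the way annotators do, and values near 0 indicate the kind of disagreement described above.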

Abstract

Topic model evaluation, like evaluation of other unsupervised methods, can be contentious. However, the field has coalesced around automated estimates of topic coherence, which rely on the frequency of word co-occurrences in a reference corpus. Contemporary neural topic models surpass classical ones according to these metrics. At the same time, topic model evaluation suffers from a validation gap: automated coherence, developed for classical models, has not been validated using human experimentation for neural models. In addition, a meta-analysis of topic modeling literature reveals a substantial standardization gap in automated topic modeling benchmarks. To address the validation gap, we compare automated coherence with the two most widely accepted human judgment tasks: topic rating and word intrusion. To address the standardization gap, we systematically evaluate a dominant classical…
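To make "automated estimates of topic coherence, which rely on the frequency of word co-occurrences in a reference corpus" concrete, here is a minimal sketch of one widely used variant, normalized pointwise mutual information (NPMI), averaged over the word pairs of a topic. Several choices are assumptions for illustration and are not confirmed by the text above: co-occurrence is counted at the document level, pairs that never co-occur are scored -1, and the function name `npmi_coherence` is hypothetical.

```python
import math
from itertools import combinations


def npmi_coherence(topic_words, reference_docs):
    """Mean NPMI over all pairs of a topic's top words.

    topic_words: list of the topic's top words.
    reference_docs: list of tokenized documents (lists of words).
    Assumption: probabilities are document frequencies in the reference corpus.
    """
    doc_sets = [set(doc) for doc in reference_docs]
    n_docs = len(doc_sets)

    def prob(*words):
        # Fraction of reference documents containing all the given words.
        return sum(all(w in d for w in words) for d in doc_sets) / n_docs

    scores = []
    for w1, w2 in combinations(topic_words, 2):
        p1, p2, p12 = prob(w1), prob(w2), prob(w1, w2)
        if p12 == 0.0:
            # Assumed convention: words that never co-occur get the minimum.
            scores.append(-1.0)
        elif p12 == 1.0:
            # Degenerate case: both words appear in every document, so the
            # normalizer -log(p12) is 0. Scored as 0 here (an assumption).
            scores.append(0.0)
        else:
            pmi = math.log(p12 / (p1 * p2))
            scores.append(pmi / -math.log(p12))
    return sum(scores) / len(scores)
```

A topic whose top words tend to appear in the same reference documents scores close to 1, and a topic whose words rarely appear together scores close to -1. Because the score depends on the reference corpus, the counting window, and the number of top words, different papers can report values that are not directly comparable.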

Videos

Is Automated Topic Model Evaluation Broken? The Incoherence of Coherence (SlidesLive)

Taxonomy

Topics: Topic Modeling · Computational and Text Analysis Methods · Explainable Artificial Intelligence (XAI)