LLM Reading Tea Leaves: Automatically Evaluating Topic Models with Large Language Models

Xiaohao Yang; He Zhao; Dinh Phung; Wray Buntine; Lan Du

arXiv:2406.09008 · cs.CL · January 15, 2025

TL;DR

This paper introduces WALM, a novel evaluation method using Large Language Models to assess topic models holistically, aligning well with human judgment and addressing limitations of existing metrics.

Contribution

Proposes WALM, a comprehensive LLM-based evaluation approach for topic models that jointly considers semantic quality of document representations and topics.

Findings

1. WALM aligns with human judgment in evaluating topic models.
2. WALM provides a more holistic assessment compared to traditional metrics.
3. The software implementation is publicly available.

Abstract

Topic modeling has been a widely used tool for unsupervised text analysis. However, comprehensive evaluations of a topic model remain challenging. Existing evaluation methods are either less comparable across different models (e.g., perplexity) or focus on only one specific aspect of a model (e.g., topic quality or document representation quality) at a time, which is insufficient to reflect the overall model performance. In this paper, we propose WALM (Word Agreement with Language Model), a new evaluation method for topic modeling that considers the semantic quality of document representations and topics in a joint manner, leveraging the power of Large Language Models (LLMs). With extensive experiments involving different types of topic models, WALM is shown to align with human judgment and can serve as a complementary evaluation method to the existing ones, bringing a new perspective…
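
The abstract does not spell out how word agreement is computed. As a minimal illustrative sketch only, assume agreement is approximated by the overlap between the words a topic model associates with a document and the keywords an LLM produces for the same document. The function name `overlap_rate`, the example word lists, and the overlap formula itself are assumptions made for illustration, not the paper's definition of WALM.

```python
def overlap_rate(topic_words, llm_keywords):
    """Fraction of LLM keywords that also appear among the topic model's words.

    Assumption: a simple case-insensitive set overlap stands in for
    "word agreement"; the paper's actual measure is not given in the abstract.
    """
    topic_set = {w.lower() for w in topic_words}
    keyword_set = {w.lower() for w in llm_keywords}
    if not keyword_set:
        # No keywords to agree with; define the score as 0 to avoid division by zero.
        return 0.0
    return len(topic_set & keyword_set) / len(keyword_set)


# Hypothetical inputs: words a topic model assigns to one document,
# and keywords an LLM might return for that same document.
topic_words = ["game", "team", "season", "player", "coach"]
llm_keywords = ["team", "player", "league", "score"]

score = overlap_rate(topic_words, llm_keywords)
print(f"overlap rate: {score:.2f}")
```

Averaging such a per-document score over a corpus would give one number per topic model, which is the kind of cross-model comparability the abstract says perplexity lacks. Exact string matching is the crudest possible choice here; it gives no credit for synonyms.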

Taxonomy

Topics: Computational and Text Analysis Methods

Methods: ALIGN · Focus