Tethering Broken Themes: Aligning Neural Topic Models with Labels and Authors

Mayank Nagda; Phil Ostheimer; Sophie Fellenz

arXiv:2410.18140·cs.IR·February 10, 2025

TL;DR

FANToM is a new neural topic modeling method that effectively incorporates labels and authorship metadata, resulting in more interpretable topics, better alignment with human intentions, and insights into author interests.

Contribution

Introduces FANToM, a novel neural topic model that integrates labels and authorship information to improve topic interpretability and alignment.

Findings

01 Topics are of higher quality and align better with human-provided labels.

02 The model can identify author interests and similarities between authors.

03 In experiments, FANToM improves existing models in both topic quality and alignment.

Abstract

Topic models are a popular approach for extracting semantic information from large document collections. However, recent studies suggest that the topics generated by these models often do not align well with human intentions. Although metadata such as labels and authorship information is available, it has not yet been effectively incorporated into neural topic models. To address this gap, we introduce FANToM, a novel method to align neural topic models with both labels and authorship information. FANToM allows for the inclusion of this metadata when available, producing interpretable topics and author distributions for each topic. Our approach demonstrates greater expressiveness than conventional topic models by learning the alignment between labels, topics, and authors. Experimental results show that FANToM improves existing models in terms of both topic quality and alignment.…

Taxonomy

Topics: Topic Modeling

Methods: ALIGN