Coordinated Topic Modeling

Pritom Saha Akash; Jie Huang; Kevin Chen-Chuan Chang

arXiv:2210.08559 · cs.CL · October 25, 2022

TL;DR

This paper introduces coordinated topic modeling, a task that treats a set of well-defined topics as reference axes so that topic representations are more interpretable and directly comparable across corpora. It addresses the task with an embedding-based model trained with self-training.

Contribution

The paper proposes ECTM, an embedding-based coordinated topic model that incorporates reference axes and self-training to improve interpretability and corpus comparison.

Findings

01. ECTM outperforms baseline models in experiments across multiple domains.

02. The model captures corpus-specific semantics while maintaining each topic's global semantics.

03. Self-training improves the model's ability to align topics with the reference axes.

Abstract

We propose a new problem, coordinated topic modeling, that imitates how humans describe a text corpus. It treats a set of well-defined topics as the axes of a semantic space, each with a reference representation, and uses those axes to model a corpus so that the result is easy to understand. This task makes corpus representations more interpretable by reusing existing knowledge, and it benefits corpus comparison. We design ECTM, an embedding-based coordinated topic model that uses the reference representation to capture aspects specific to the target corpus while maintaining each topic's global semantics. ECTM introduces topic- and document-level supervision with a self-training mechanism to solve the problem. Extensive experiments on multiple domains show that our model outperforms the baselines.
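The idea of using fixed topics as axes can be illustrated with a minimal sketch. This is not ECTM and does not reproduce the paper's supervision or self-training. It only assumes that each reference topic and each document already has an embedding vector, and it places a document on the axes by taking a softmax over cosine similarities. The axis names, the 3-dimensional toy vectors, the temperature value, and the helper names `coordinates` and `corpus_profile` are all hypothetical choices made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def softmax(scores, temperature=0.1):
    """Turn similarity scores into weights that sum to 1 (temperature is an assumed value)."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def coordinates(doc_vector, axes):
    """Place one document on the reference axes: one weight per topic."""
    names = list(axes)
    weights = softmax([cosine(doc_vector, axes[name]) for name in names])
    return dict(zip(names, weights))

def corpus_profile(doc_vectors, axes):
    """Average the document coordinates to describe a whole corpus."""
    profile = {name: 0.0 for name in axes}
    for vec in doc_vectors:
        for name, weight in coordinates(vec, axes).items():
            profile[name] += weight / len(doc_vectors)
    return profile

# Hypothetical toy embeddings; real ones would come from a trained embedding model.
REFERENCE_AXES = {
    "sports": [1.0, 0.1, 0.0],
    "politics": [0.0, 1.0, 0.1],
    "science": [0.1, 0.0, 1.0],
}
corpus_a = [[0.9, 0.2, 0.1], [0.8, 0.1, 0.3]]
corpus_b = [[0.1, 0.2, 0.9], [0.2, 0.8, 0.3]]

profile_a = corpus_profile(corpus_a, REFERENCE_AXES)
profile_b = corpus_profile(corpus_b, REFERENCE_AXES)
print(profile_a)
print(profile_b)
```

Because both corpora are described on the same axes, their profiles can be compared entry by entry, which is the kind of corpus comparison the abstract refers to. A model that learns topics separately for each corpus gives no such shared coordinate system.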

Taxonomy

Topics: Topic Modeling · Advanced Text Analysis Techniques · Text and Document Classification Technologies