Coordinated Topic Modeling
Pritom Saha Akash, Jie Huang, Kevin Chen-Chuan Chang

TL;DR
This paper introduces coordinated topic modeling, a new task that treats a set of well-defined topics as reference axes of a semantic space, producing more interpretable and comparable representations of corpora. The proposed model, ECTM, is embedding-based and trained with a self-training mechanism.
Contribution
It proposes ECTM, an embedding-based coordinated topic model that incorporates reference axes and self-training to improve interpretability and corpus comparison.
Findings
ECTM outperforms baseline models in experiments across multiple domains.
The model captures aspects specific to the target corpus while maintaining each topic's global semantics.
Self-training enhances the model's ability to align topics with reference axes.
Abstract
We propose a new problem called coordinated topic modeling that imitates how humans describe a text corpus. It treats a set of well-defined topics as the axes of a semantic space, each with a reference representation. It then uses the axes to model a corpus for an easily understandable representation. This new task helps represent a corpus more interpretably by reusing existing knowledge and benefits the corpora comparison task. We design ECTM, an embedding-based coordinated topic model that effectively uses the reference representation to capture the target corpus-specific aspects while maintaining each topic's global semantics. In ECTM, we introduce topic- and document-level supervision with a self-training mechanism to solve the problem. Finally, extensive experiments on multiple domains show the superiority of our model over other baselines.
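The idea of describing a corpus along shared reference axes can be illustrated with a small sketch. This is not ECTM itself: it omits the topic- and document-level supervision and the self-training mechanism, and simply assumes that documents and topic axes already live in one embedding space. The vectors, the function names (`project_onto_axes`, `corpus_profile`), and the `temperature` parameter are all hypothetical choices made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def softmax(scores, temperature=0.1):
    """Turn similarity scores into a distribution (temperature is an assumed knob)."""
    m = max(scores)
    exps = [math.exp((s - m) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def project_onto_axes(doc_vec, axis_vecs, temperature=0.1):
    """Describe one document as a distribution over the reference topic axes."""
    sims = [cosine(doc_vec, axis) for axis in axis_vecs]
    return softmax(sims, temperature)

def corpus_profile(doc_vecs, axis_vecs):
    """Average the per-document distributions to get one profile per corpus."""
    dists = [project_onto_axes(d, axis_vecs) for d in doc_vecs]
    return [sum(d[i] for d in dists) / len(dists) for i in range(len(axis_vecs))]

# Toy, hand-made embeddings (assumed data, not from the paper).
axes = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]       # two reference topics
corpus_a = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]]   # leans toward topic 0
corpus_b = [[0.1, 0.9, 0.0], [0.2, 0.7, 0.1]]   # leans toward topic 1

profile_a = corpus_profile(corpus_a, axes)
profile_b = corpus_profile(corpus_b, axes)
print(profile_a, profile_b)
```

Because both corpora are expressed over the same axes, their profiles can be compared coordinate by coordinate, which is the kind of corpus comparison the abstract refers to.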
Taxonomy
Topics: Topic Modeling · Advanced Text Analysis Techniques · Text and Document Classification Technologies
