Embedded Topic Models Enhanced by Wikification
Takashi Shibuya, Takehito Utsuro

TL;DR
This paper introduces a neural topic model that incorporates Wikipedia knowledge to make it aware of named entities, improving generalizability and capturing how topics develop over time in document collections.
Contribution
It presents a novel method that integrates Wikification into neural topic models, making them entity-aware and better able to capture the temporal development of topics.
Findings
Improved generalizability of neural topic models on New York Times news articles and the AIDA-CoNLL dataset.
Enhanced recognition of named entities in topics.
Effective modeling of temporal topic development.
Abstract
Topic modeling analyzes a collection of documents to learn meaningful patterns of words. However, previous topic models consider only the spelling of words and do not take into consideration the homography of words. In this study, we incorporate the Wikipedia knowledge into a neural topic model to make it aware of named entities. We evaluate our method on two datasets, 1) news articles of New York Times and 2) the AIDA-CoNLL dataset. Our experiments show that our method improves the performance of neural topic models in generalizability. Moreover, we analyze frequent terms in each topic and the temporal dependencies between topics to demonstrate that our entity-aware topic models can capture the time-series development of topics well.
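To make the homography point concrete, here is a minimal sketch of one way entity links could be folded into a bag-of-words vocabulary before topic modeling. It is an illustration under stated assumptions, not the authors' pipeline: the function names, the `ENTITY/` prefix, and the hand-written link spans are all hypothetical, and a real system would obtain the spans from an entity linker.

```python
from collections import Counter


def wikify_tokens(tokens, links):
    """Replace each linked mention span with a single entity token.

    tokens: list of str
    links:  list of (start, end, entity_title) with end exclusive and
            spans assumed non-overlapping (hypothetical linker output).
    """
    by_start = {start: (end, title) for start, end, title in links}
    out = []
    i = 0
    while i < len(tokens):
        if i in by_start:
            end, title = by_start[i]
            # The "ENTITY/" prefix is an assumed convention that keeps
            # entity tokens distinct from ordinary surface words.
            out.append("ENTITY/" + title)
            i = max(end, i + 1)  # always advance, even on a bad span
        else:
            out.append(tokens[i].lower())
            i += 1
    return out


def bag_of_words(tokens):
    """Count token occurrences for one document."""
    return Counter(tokens)


doc_a = "Apple unveiled a new phone".split()
doc_b = "She ate an apple".split()
links_a = [(0, 1, "Apple_Inc.")]  # hand-written stand-in for linker output

bow_a = bag_of_words(wikify_tokens(doc_a, links_a))
bow_b = bag_of_words(wikify_tokens(doc_b, []))
```

With spelling alone, both documents would contribute to the same lowercased word type `apple`. After the replacement step, the company mention becomes a separate vocabulary item, so a topic model trained on these counts can treat the two senses differently.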
Taxonomy
Topics: Web Data Mining and Analysis
Methods: Attentive Walk-Aggregating Graph Neural Network
