Exploratory topic modeling with distributional semantics
Samuel Rönnqvist

TL;DR
This paper introduces a novel exploratory topic modeling method that uses distributional semantics and word vectors to create a network-based, visually interactive map of topics, enhancing interpretability and exploration of large textual datasets.
Contribution
It proposes a new approach to topic modeling that maps topics as a network based on semantic similarities, improving interpretability over traditional probabilistic models.
Findings
Topics form clustered regions and concept gradients in the network.
The method supports interactive visual exploration of thematic structures.
It enhances interpretability of latent topics in large text corpora.
Abstract
As we continue to collect and store textual data in a multitude of domains, we are regularly confronted with material whose largely unknown thematic structure we want to uncover. With unsupervised, exploratory analysis, no prior knowledge about the content is required and highly open-ended tasks can be supported. In the past few years, probabilistic topic modeling has emerged as a popular approach to this problem. Nevertheless, the representation of the latent topics as aggregations of semi-coherent terms limits their interpretability and level of detail. This paper presents an alternative approach to topic modeling that maps topics as a network for exploration, based on distributional semantics using learned word vectors. From the granular level of terms and their semantic similarity relations, global topic structures emerge as clustered regions and gradients of concepts. Moreover,…
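The general idea of a term network built from word-vector similarity can be illustrated with a minimal sketch. This is not the paper's implementation: the vectors below are hypothetical toy values rather than learned embeddings, the similarity threshold is an arbitrary assumption, and connected components stand in for whatever clustering or layout the paper actually uses. The function names `similarity_network` and `connected_components` are made up for illustration.

```python
import numpy as np
from collections import deque

# Hypothetical 3-dimensional toy vectors (NOT learned from any corpus).
vectors = {
    "bank":    np.array([0.90, 0.10, 0.00]),
    "loan":    np.array([0.80, 0.20, 0.10]),
    "credit":  np.array([0.85, 0.15, 0.05]),
    "virus":   np.array([0.00, 0.90, 0.20]),
    "vaccine": np.array([0.10, 0.80, 0.30]),
}


def similarity_network(vectors, threshold=0.9):
    """Link two terms when their cosine similarity meets the threshold.

    The threshold value is an assumption chosen for this toy data.
    """
    terms = list(vectors)
    matrix = np.stack([vectors[t] for t in terms])
    unit = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)
    sims = unit @ unit.T  # pairwise cosine similarities
    edges = {t: set() for t in terms}
    for i in range(len(terms)):
        for j in range(i + 1, len(terms)):
            if sims[i, j] >= threshold:
                edges[terms[i]].add(terms[j])
                edges[terms[j]].add(terms[i])
    return edges


def connected_components(edges):
    """Group terms into connected regions of the network (breadth-first search)."""
    seen, components = set(), []
    for start in edges:
        if start in seen:
            continue
        seen.add(start)
        component, queue = set(), deque([start])
        while queue:
            term = queue.popleft()
            component.add(term)
            for neighbor in edges[term]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        components.append(component)
    return components


network = similarity_network(vectors)
regions = connected_components(network)
for region in regions:
    print(sorted(region))
```

With these toy vectors the finance terms and the health terms fall into two separate regions. With real embeddings, the threshold and the grouping step would need tuning, and a graph layout tool would be needed for the interactive exploration the abstract describes.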
