HTMOT : Hierarchical Topic Modelling Over Time
Judicael Poumay, Ashwin Ittoo

TL;DR
HTMOT is a novel hierarchical and temporal topic modeling method that efficiently captures evolving topics and detailed sub-topics over time, providing more precise insights into large text corpora.
Contribution
The paper introduces HTMOT, a new hierarchical temporal topic model with an efficient Gibbs sampling implementation that improves the extraction of high-level and specific sub-topics.
Findings
Fast training procedure demonstrated.
Accurate extraction of high-level and sub-topics.
Effective in analyzing space industry developments in 2020.
Abstract
Over the years, topic models have provided an efficient way of extracting insights from text. However, while many models have been proposed, none are able to model topic temporality and hierarchy jointly. Modelling time provides more precise topics by separating lexically close but temporally distinct topics, while modelling hierarchy provides a more detailed view of the content of a document corpus. In this study, we therefore propose a novel method, HTMOT, to perform Hierarchical Topic Modelling Over Time. We train HTMOT using a new, more efficient implementation of Gibbs sampling. Specifically, we show that applying time modelling only to deep sub-topics provides a way to extract specific stories or events, while high-level topics extract larger themes in the corpus. Our results show that our training procedure is fast and can extract accurate high-level topics and temporally…
Taxonomy
Topics: Computational and Text Analysis Methods · Topic Modeling · Data Quality and Management
