Monitoring geometrical properties of word embeddings for detecting the emergence of new topics
Clément Christophe, Julien Velcin, Jairo Cugliari, Manel Boumghar, Philippe Suignard

TL;DR
This paper introduces a method for early detection of slowly emerging topics by monitoring geometrical properties of word embeddings, demonstrating improved performance over existing methods on public datasets.
Contribution
It proposes a novel approach that uses geometrical properties of word embeddings to detect emerging topics earlier than traditional methods.
Findings
Outperforms state-of-the-art methods on press and scientific datasets
Provides a quantitative framework for evaluating early topic detection
Effectively detects weak signals indicating new topic emergence
Abstract
Slowly emerging topic detection is a task that lies between event detection, where we aggregate the behaviors of different words over a short period of time, and language evolution, where we monitor their long-term evolution. In this work, we tackle the problem of early detection of slowly emerging new topics. To this end, we gather evidence of weak signals at the word level. We propose to monitor the behavior of word representations in an embedding space and use one of its geometrical properties to characterize the emergence of topics. As evaluation is typically hard for this kind of task, we present a framework for quantitative evaluation. We show positive results that outperform state-of-the-art methods on two public datasets of press and scientific articles.
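The abstract does not say which geometrical property is monitored, so the following is only an illustrative sketch and not the authors' method. It assumes one embedding table per time slice (a dict mapping each word to a vector) and uses a stand-in property: the mean cosine similarity between a word and its k nearest neighbors. The function names, `k`, and `min_change` are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def neighborhood_density(word, embeddings, k=2):
    """Stand-in geometric property (an assumption, not taken from the paper):
    mean cosine similarity between `word` and its k nearest neighbors."""
    sims = sorted(
        (cosine(embeddings[word], vec)
         for other, vec in embeddings.items() if other != word),
        reverse=True,
    )
    top = sims[:k]
    return sum(top) / len(top) if top else 0.0


def flag_candidates(slices, k=2, min_change=0.2):
    """Return {word: change} for words whose property moved by at least
    `min_change` between the first and the last time slice."""
    # Only words present in every slice can be tracked over time.
    vocab = set.intersection(*(set(emb) for emb in slices))
    flagged = {}
    for word in vocab:
        series = [neighborhood_density(word, emb, k) for emb in slices]
        change = series[-1] - series[0]
        if abs(change) >= min_change:
            flagged[word] = change
    return flagged
```

One reason to pick a neighborhood-based quantity for a sketch like this is that it only compares words within the same slice, so it is unaffected by rotations between independently trained embedding spaces and needs no alignment step. Whether the paper relies on that is not stated in the abstract.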
Taxonomy
Topics: Advanced Text Analysis Techniques · Complex Network Analysis Techniques · Topic Modeling
