Enhance Topics Analysis based on Keywords Properties
Antonio Penta

TL;DR
This paper introduces a specificity score based on keyword properties to identify the most informative topics in topic modeling, improving the evaluation process by reducing information loss compared to existing coherence scores.
Contribution
The paper proposes a novel specificity score for keywords that enhances the selection of informative topics in topic modeling, addressing evaluation challenges.
Findings
The specificity score effectively identifies informative topics.
The approach reduces information loss compared to coherence scores.
Experimental results show improved topic selection accuracy.
Abstract
Topic modelling is one of the most prevalent text analysis techniques used to explore and retrieve collections of documents. The evaluation of topic model algorithms is still a very challenging task due to the absence of a gold-standard list of topics to compare against for every corpus. In this work, we present a specificity score based on keyword properties that is able to select the most informative topics, helping the user to focus on them. In the experiments, we show that we are able to compress state-of-the-art topic modelling results by different factors with an information loss that is much lower than that of the solution based on the recent coherence score presented in the literature.
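The abstract does not define the specificity score, so the following is only a minimal sketch of the general idea of ranking topics by how specific their keywords are. It assumes a stand-in score, the mean inverse topic frequency of a topic's keywords (a keyword shared by many topics counts as less specific). The function names `topic_specificity` and `select_top_topics` are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def topic_specificity(topics):
    """Score each topic by the mean inverse topic frequency of its keywords.

    ASSUMPTION: this is an illustrative stand-in, not the score defined
    in the paper. A keyword that appears in few topics is treated as
    more specific than one that appears in many.
    """
    n_topics = len(topics)
    # Count in how many topics each keyword appears (once per topic).
    topic_freq = Counter(kw for kws in topics for kw in set(kws))
    scores = []
    for kws in topics:
        unique = set(kws)
        if not unique:
            # An empty topic carries no keywords, so give it the lowest score.
            scores.append(0.0)
            continue
        itf = [math.log(n_topics / topic_freq[kw]) for kw in unique]
        scores.append(sum(itf) / len(itf))
    return scores


def select_top_topics(topics, k):
    """Return the indices of the k topics with the highest score."""
    scores = topic_specificity(topics)
    ranked = sorted(range(len(topics)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    # Toy keyword lists, invented for illustration only.
    topics = [
        ["data", "model", "learning"],
        ["data", "model", "system"],
        ["protein", "gene", "cell"],
    ]
    print(select_top_topics(topics, 1))
```

Keeping only the top-k topics is one way to read "compressing" a topic model's output; how much information is lost by doing so would have to be measured separately, as the paper does in its experiments.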
Taxonomy
Topics: Advanced Text Analysis Techniques · Topic Modeling · Technology and Data Analysis
