A Large Language Model Guided Topic Refinement Mechanism for Short Text Modeling
Shuyu Chang, Rui Wang, Peng Ren, Qi Wang, Haiping Huang

TL;DR
This paper presents a novel, model-agnostic topic refinement mechanism that uses Large Language Models to improve the coherence and quality of topics extracted from short texts, enhancing downstream classification performance.
Contribution
It introduces an LLM-guided post-processing approach for short-text topic modeling that refines topics through prompt engineering, addressing the incoherent topics that traditional models tend to produce from sparse short texts.
Findings
Topic Refinement improves topic coherence across datasets.
Enhanced topics lead to better text classification results.
The mechanism is model-agnostic and adaptable to various topic models.
Abstract
Modeling topics effectively in short texts, such as tweets and news snippets, is crucial to capturing rapidly evolving social trends. Existing topic models often struggle to accurately capture the underlying semantic patterns of short texts, primarily due to the sparse nature of such data. This nature of texts leads to an unavoidable lack of co-occurrence information, which hinders the coherence and granularity of mined topics. This paper introduces a novel model-agnostic mechanism, termed Topic Refinement, which leverages the advanced text comprehension capabilities of Large Language Models (LLMs) for short-text topic modeling. Unlike traditional methods, this post-processing mechanism enhances the quality of topics extracted by various topic modeling methods through prompt engineering. We guide LLMs in identifying semantically intruder words within the extracted topics and suggesting…
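The refinement step described in the abstract (asking an LLM to flag intruder words in a topic and suggest replacements) can be illustrated with a minimal sketch. This is not the authors' implementation: the prompt wording, the `intruder -> replacement` response format, and the names `build_prompt`, `parse_response`, `refine_topic`, and `fake_llm` are all assumptions made for illustration, and the LLM is a stub callable so that the example runs without any external service.

```python
from typing import Callable, Dict, List


def build_prompt(topic_words: List[str]) -> str:
    """Build a refinement prompt (hypothetical wording, not the paper's)."""
    return (
        "The following words were extracted as one topic from short texts:\n"
        + ", ".join(topic_words)
        + "\nList any intruder words that do not fit the shared theme, one per "
        "line in the form 'intruder -> replacement'. "
        "If every word fits, answer 'none'."
    )


def parse_response(response: str) -> Dict[str, str]:
    """Parse 'intruder -> replacement' lines; ignore anything else."""
    replacements: Dict[str, str] = {}
    for line in response.splitlines():
        if "->" in line:
            old, new = line.split("->", 1)
            old, new = old.strip().lower(), new.strip().lower()
            if old and new:
                replacements[old] = new
    return replacements


def refine_topic(topic_words: List[str], llm: Callable[[str], str]) -> List[str]:
    """Replace flagged intruder words, keeping the topic's order and size."""
    replacements = parse_response(llm(build_prompt(topic_words)))
    existing = {w.lower() for w in topic_words}
    refined: List[str] = []
    for word in topic_words:
        new = replacements.get(word.lower())
        # Accept a replacement only if it does not duplicate a topic word.
        if new and new not in existing:
            refined.append(new)
            existing.add(new)
        else:
            refined.append(word)
    return refined


def fake_llm(prompt: str) -> str:
    """Stand-in for a real LLM call (assumption: returns the format above)."""
    return "banana -> striker\n"


if __name__ == "__main__":
    topic = ["football", "goal", "league", "banana", "coach"]
    print(refine_topic(topic, fake_llm))
    # ['football', 'goal', 'league', 'striker', 'coach']
```

Because `refine_topic` only takes a list of topic words and a callable, it can in principle sit behind any topic model that outputs top words per topic, which is the sense in which a post-processing step like this is model-agnostic.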
Taxonomy
Topics: Topic Modeling · Advanced Text Analysis Techniques · Computational and Text Analysis Methods
