NTLRAG: Narrative Topic Labels derived with Retrieval Augmented Generation
Lisa Grobelscheg, Ema Kahr, Mark Strembeck

TL;DR
NTLRAG is a scalable framework that uses retrieval augmented generation to produce human-interpretable narrative labels for topics in large text collections, making topics easier to interpret than traditional keyword lists do.
Contribution
Introduces NTLRAG, a novel retrieval augmented generation framework that creates semantically rich, human-readable topic labels for social media datasets, enhancing interpretability of topic models.
Findings
User study shows superior interpretability of NTLRAG labels
Effective on datasets with over 6.7 million social media messages
Can be combined with any standard topic model
Abstract
Topic modeling has evolved as an important means to identify evident or hidden topics within large collections of text documents. Topic modeling approaches are often used for analyzing and making sense of social media discussions consisting of millions of short text messages. However, assigning meaningful topic labels to document clusters remains challenging, as users are commonly presented with unstructured keyword lists that may not accurately capture the respective core topic. In this paper, we introduce Narrative Topic Labels derived with Retrieval Augmented Generation (NTLRAG), a scalable and extensible framework that generates semantically precise and human-interpretable narrative topic labels. Our narrative topic labels provide a context-rich, intuitive concept to describe topic model output. In particular, NTLRAG uses retrieval augmented generation (RAG) techniques and considers…
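The general idea of labeling a topic with retrieval augmented generation can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `retrieve` and `build_prompt`, the bag-of-words cosine ranking, and the prompt wording are all assumptions made for illustration, and the call to a language model is left out entirely. The sketch only shows how a topic's keywords could be used to fetch representative messages, which are then packed into a prompt asking for a narrative label.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(topic_keywords, documents, k=2):
    """Return up to k documents ranked by cosine similarity to the keywords.

    Hypothetical retrieval step: a plain bag-of-words ranking stands in for
    whatever retriever a real RAG pipeline would use.
    """
    query = Counter(t.lower() for t in topic_keywords)
    query_norm = math.sqrt(sum(v * v for v in query.values()))
    scored = []
    for doc in documents:
        vec = Counter(tokenize(doc))
        dot = sum(query[t] * vec[t] for t in query)
        norm = query_norm * math.sqrt(sum(v * v for v in vec.values()))
        scored.append((dot / norm if norm else 0.0, doc))
    # Highest similarity first; documents with no keyword overlap are dropped.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:k] if score > 0]


def build_prompt(topic_keywords, retrieved):
    """Assemble an illustrative prompt asking for a narrative topic label."""
    context = "\n".join(f"- {doc}" for doc in retrieved)
    return (
        "Topic keywords: " + ", ".join(topic_keywords) + "\n"
        "Representative messages:\n" + context + "\n"
        "Write a one-sentence narrative label describing this topic."
    )


if __name__ == "__main__":
    # Toy data, invented for this example only.
    messages = [
        "Flood warnings issued as river levels rise",
        "New phone released today",
        "Rescue teams respond to flood in the river valley",
    ]
    keywords = ["flood", "river", "rescue"]
    print(build_prompt(keywords, retrieve(keywords, messages)))
```

In a full pipeline the resulting prompt would be passed to a language model, and the keywords would come from the output of a topic model rather than being written by hand.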
Taxonomy
Topics: Computational and Text Analysis Methods · Topic Modeling · Sentiment Analysis and Opinion Mining
