Retrieval Augmented Generation for Topic Modeling in Organizational Research: An Introduction with Empirical Demonstration
Gerion Spielberger, Florian M. Artinger, Jochen Reb, Rudolf Kerschreiter

TL;DR
This paper introduces Agentic Retrieval-Augmented Generation (Agentic RAG), a novel method combining retrieval, generation, and agent-driven learning for more efficient, interpretable, and reliable topic modeling using LLMs in organizational research.
Contribution
It presents a new approach, Agentic RAG, that improves upon existing topic modeling methods by integrating retrieval and iterative learning with LLMs, enhancing efficiency and interpretability.
Findings
Agentic RAG outperforms standard ML and LLM prompting in reliability and validity.
The method produces more semantically relevant and reproducible topics.
Empirical validation on Twitter data demonstrates its effectiveness.
Abstract
Analyzing textual data is the cornerstone of qualitative research. While traditional methods such as grounded theory and content analysis are widely used, they are labor-intensive and time-consuming. Topic modeling offers an automated complement. Yet, existing approaches, including LLM-based topic modeling, still struggle with issues such as high data preprocessing requirements, interpretability, and reliability. This paper introduces Agentic Retrieval-Augmented Generation (Agentic RAG) as a method for topic modeling with LLMs. It integrates three key components: (1) retrieval, enabling automatized access to external data beyond an LLM's pre-trained knowledge; (2) generation, leveraging LLM capabilities for text synthesis; and (3) agent-driven learning, iteratively refining retrieval and query formulation processes. To empirically validate Agentic RAG for topic modeling, we reanalyze a…
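The three components named in the abstract (retrieval, generation, and agent-driven refinement of the query) can be illustrated with a toy loop. This is a minimal sketch under stated assumptions, not the authors' implementation: every function name here is hypothetical, retrieval is plain bag-of-words cosine similarity, and `generate_topic` is a frequency-based stand-in for the step where a real system would prompt an LLM.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"the", "a", "is", "my", "to", "and", "of", "on", "for", "at", "in", "from"}


def tokenize(text):
    """Lowercase, keep alphabetic tokens, drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def cosine(a, b):
    """Cosine similarity between two term-count Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, docs, k=3):
    """Retrieval step: return the k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(docs, key=lambda d: cosine(q, Counter(tokenize(d))), reverse=True)
    return ranked[:k]


def generate_topic(passages, n_terms=3):
    """Generation stand-in: most frequent terms in the retrieved passages.

    A real pipeline would send the passages to an LLM and ask for a topic label.
    """
    counts = Counter(t for p in passages for t in tokenize(p))
    return [term for term, _ in counts.most_common(n_terms)]


def agentic_topic(seed_query, docs, rounds=3, k=3):
    """Agent loop: reformulate the query from the generated topic until it stabilizes."""
    query, topic, passages = seed_query, [], []
    for _ in range(rounds):
        passages = retrieve(query, docs, k)
        new_topic = generate_topic(passages)
        if new_topic == topic:  # topic no longer changes, so stop refining
            break
        topic = new_topic
        query = seed_query + " " + " ".join(topic)  # refined query for the next round
    return topic, passages


if __name__ == "__main__":
    # Made-up example posts, for illustration only.
    posts = [
        "Working from home improved my focus and productivity",
        "Remote work makes team meetings harder to schedule",
        "The cafeteria menu changed again this week",
        "Remote work saves commuting time every day",
    ]
    print(agentic_topic("remote work", posts, k=2))
```

The loop stops as soon as two consecutive rounds yield the same topic terms, which is one simple way to make the output reproducible for a fixed corpus and seed query; an LLM-backed generation step would need its own controls (for example, fixed decoding settings) to behave similarly.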
Taxonomy
Methods: Attention Is All You Need · Weight Decay · Attention Dropout · Byte Pair Encoding · Dense Connections · Residual Connection · Linear Layer · Linear Warmup With Linear Decay · WordPiece
