Concept Navigation and Classification via Open-Source Large Language Model Processing
Maël Kubli

TL;DR
This paper introduces a hybrid framework utilizing open-source large language models combined with human validation to detect and classify latent constructs like frames, narratives, and topics from textual data, improving accuracy and interpretability.
Contribution
It presents a novel hybrid methodology integrating automated LLM processing with human-in-the-loop validation for concept detection and classification.
Findings
Effective in analyzing diverse data sets, including AI policy debates, newspaper articles on encryption, and the 20 Newsgroups data set.
Enhances accuracy and interpretability of latent construct detection.
Demonstrates versatility across political, media, and topic analysis tasks.
Abstract
This paper presents a novel methodological framework for detecting and classifying latent constructs, including frames, narratives, and topics, from textual data using Open-Source Large Language Models (LLMs). The proposed hybrid approach combines automated summarization with human-in-the-loop validation to enhance the accuracy and interpretability of construct identification. By employing iterative sampling coupled with expert refinement, the framework guarantees methodological robustness and ensures conceptual precision. Applied to diverse data sets, including AI policy debates, newspaper articles on encryption, and the 20 Newsgroups data set, this approach demonstrates its versatility in systematically analyzing complex political discourses, media framing, and topic classification tasks.
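The abstract describes the loop only at a high level, so the following is a minimal sketch of one way iterative sampling with human-in-the-loop validation could be organized, not the paper's implementation. The names `discover_constructs`, `summarize`, `review`, `sample_size`, `max_rounds`, and `patience` are all hypothetical, and the stopping rule (stop after several rounds with no new accepted construct) is an assumption. The LLM call and the expert review step are passed in as plain functions, so any open-source model and any review interface could be substituted.

```python
import random
from typing import Callable


def discover_constructs(
    documents: list[str],
    summarize: Callable[[list[str]], list[str]],
    review: Callable[[list[str]], list[str]],
    sample_size: int = 5,
    max_rounds: int = 10,
    patience: int = 2,
    seed: int = 0,
) -> list[str]:
    """Build a list of constructs by repeated sampling and validation.

    summarize: stand-in for an LLM call that proposes candidate labels
               (frames, narratives, topics) for a batch of documents.
    review:    stand-in for the human expert, returning only the
               candidates that are accepted (possibly reworded).
    The loop stops early after `patience` consecutive rounds that add
    nothing new (an assumed stopping rule, not taken from the paper).
    """
    rng = random.Random(seed)
    codebook: list[str] = []
    stale_rounds = 0
    for _ in range(max_rounds):
        # Draw a fresh random sample of documents for this round.
        batch = rng.sample(documents, min(sample_size, len(documents)))
        candidates = summarize(batch)   # automated step
        accepted = review(candidates)   # human validation step
        added = 0
        for label in accepted:
            if label not in codebook:
                codebook.append(label)
                added += 1
        stale_rounds = 0 if added else stale_rounds + 1
        if stale_rounds >= patience:
            break
    return codebook


def keyword_summarizer(batch: list[str]) -> list[str]:
    """Toy replacement for the LLM: proposes labels by keyword match."""
    labels = []
    for text in batch:
        lowered = text.lower()
        if "privacy" in lowered:
            labels.append("privacy")
        if "security" in lowered:
            labels.append("security")
    return labels


def accept_all(candidates: list[str]) -> list[str]:
    """Toy reviewer: accepts every distinct candidate, sorted."""
    return sorted(set(candidates))
```

In practice, `summarize` would wrap a prompt to a locally hosted model and `review` would present the candidates to a domain expert. The toy functions exist only to make the sketch runnable end to end.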
Taxonomy
Topics: Natural Language Processing Techniques
