AgriLens: Semantic Retrieval in Agricultural Texts Using Topic Modeling and Language Models
Heba Shakeel, Tanvir Ahmad, Tanya Liyaqat, Chandni Saxena

TL;DR
This paper introduces AgriLens, a framework combining topic modeling and language models for interpretable, scalable semantic retrieval in large agricultural texts, enabling effective organization and summarization with minimal labeled data.
Contribution
The paper presents a unified approach that integrates BERTopic, zero-shot topic labeling, and semantic search over agricultural texts, aiming for both interpretability and scalability.
Findings
Effective extraction of coherent topics from agricultural texts
Zero-shot topic labeling and summarization capabilities
Improved retrieval accuracy with dense embeddings
Abstract
As the volume of unstructured text continues to grow across domains, there is an urgent need for scalable methods that enable interpretable organization, summarization, and retrieval of information. This work presents a unified framework for interpretable topic modeling, zero-shot topic labeling, and topic-guided semantic retrieval over large agricultural text corpora. Leveraging BERTopic, we extract semantically coherent topics. Each topic is converted into a structured prompt, enabling a language model to generate meaningful topic labels and summaries in a zero-shot manner. Querying and document exploration are supported via dense embeddings and vector search, while a dedicated evaluation module assesses topical coherence and bias. This framework supports scalable and interpretable information access in specialized domains where labeled data is limited.
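To make the pipeline in the abstract concrete, here is a minimal sketch of two of its steps: turning a topic's keywords into a zero-shot labeling prompt, and topic-guided retrieval over dense vectors. It is not the authors' implementation. The names `Topic`, `build_label_prompt`, and `topic_guided_search`, the prompt wording, and the hand-written 2-D vectors are all assumptions made for illustration. In a real system the keywords would come from a topic model such as BERTopic, the vectors from a sentence-embedding model, and the brute-force cosine ranking below would be replaced by a vector index.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Topic:
    """Hypothetical container for one extracted topic."""
    topic_id: int
    keywords: List[str]    # top terms for the topic (assumed given)
    centroid: List[float]  # mean embedding of the topic's documents (assumed given)


def build_label_prompt(topic: Topic, max_keywords: int = 5) -> str:
    """Turn a topic's top keywords into a zero-shot labeling prompt.

    The prompt wording is an assumption, not the paper's template.
    """
    terms = ", ".join(topic.keywords[:max_keywords])
    return (
        "You are labeling topics from an agricultural text corpus.\n"
        f"Keywords: {terms}\n"
        "Give a short topic label and a one-sentence summary."
    )


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def topic_guided_search(
    query_vec: List[float],
    topics: List[Topic],
    docs: List[Dict],
    top_k: int = 2,
) -> Tuple[int, List[str]]:
    """Pick the topic closest to the query, then rank only that topic's documents."""
    best = max(topics, key=lambda t: cosine(query_vec, t.centroid))
    candidates = [d for d in docs if d["topic_id"] == best.topic_id]
    ranked = sorted(candidates, key=lambda d: cosine(query_vec, d["vec"]), reverse=True)
    return best.topic_id, [d["text"] for d in ranked[:top_k]]


# Toy data: two topics and three documents with made-up 2-D embeddings.
topics = [
    Topic(0, ["irrigation", "drip", "water", "soil", "moisture", "yield"], [1.0, 0.0]),
    Topic(1, ["pest", "aphid", "pesticide"], [0.0, 1.0]),
]
docs = [
    {"text": "Drip irrigation scheduling", "topic_id": 0, "vec": [0.9, 0.1]},
    {"text": "Soil moisture sensors", "topic_id": 0, "vec": [0.7, 0.3]},
    {"text": "Aphid control in wheat", "topic_id": 1, "vec": [0.1, 0.9]},
]

prompt = build_label_prompt(topics[0])
topic_id, hits = topic_guided_search([0.95, 0.05], topics, docs)
```

Restricting the ranking to the best-matching topic keeps results interpretable, since every hit can be explained by the topic label, at the cost of missing relevant documents assigned to other topics. Whether the paper's retrieval module filters this strictly is not stated in the abstract.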
Taxonomy
Topics: Computational and Text Analysis Methods · Topic Modeling · Information Retrieval and Search Behavior
