LLM-as-classifier: Semi-Supervised, Iterative Framework for Hierarchical Text Classification using Large Language Models

Doohee You; Andy Parisi; Zach Vander Velden; Lara Dantas Inojosa

arXiv:2508.16478·cs.CL·August 25, 2025

TL;DR

This paper presents a semi-supervised, iterative framework leveraging Large Language Models for hierarchical text classification, addressing industry challenges of robustness, scalability, and adaptability in real-world applications.

Contribution

It introduces a novel human-in-the-loop, semi-supervised approach that combines prompt refinement and hierarchical expansion for scalable, adaptable classification systems using LLMs.

Findings

01 Reduces resource requirements compared to fine-tuning

02 Improves classification accuracy through iterative prompt refinement

03 Supports continuous monitoring and adaptation in production

Abstract

The advent of Large Language Models (LLMs) has provided unprecedented capabilities for analyzing unstructured text data. However, deploying these models as reliable, robust, and scalable classifiers in production environments presents significant methodological challenges. Standard fine-tuning approaches can be resource-intensive and often struggle with the shifting data distributions that are common in industry. In this paper, we propose a comprehensive, semi-supervised framework that leverages the zero- and few-shot capabilities of LLMs to build hierarchical text classifiers as a solution to these industry-wide challenges. Our methodology emphasizes an iterative, human-in-the-loop process that begins with domain knowledge elicitation and progresses through prompt refinement, hierarchical expansion, and multi-faceted validation. We introduce…
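To make the idea of a prompt-driven hierarchical classifier concrete, here is a minimal sketch of two-level zero-shot classification with a small labeled set for checking a prompt revision. It is not the authors' implementation: the taxonomy, the `guidance` parameter, the function names, and `keyword_stub_llm` (a keyword matcher standing in for a real model call) are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical two-level taxonomy; these labels are illustrative, not from the paper.
TAXONOMY: Dict[str, List[str]] = {
    "billing": ["refund", "invoice"],
    "technical": ["login", "crash"],
}

# Any function that maps a prompt string to a completion string.
LLM = Callable[[str], str]


def build_prompt(text: str, labels: List[str], guidance: str = "") -> str:
    """Build a zero-shot prompt that asks for exactly one label."""
    options = ", ".join(labels)
    return (
        f"{guidance}\n"
        f"Classify the text into exactly one of: {options}.\n"
        f"Text: {text}\n"
        "Answer with the label only."
    )


def pick_label(llm: LLM, text: str, labels: List[str], guidance: str = "") -> str:
    """Query the model and map its answer onto a known label, else 'unknown'."""
    answer = llm(build_prompt(text, labels, guidance)).strip().lower()
    return answer if answer in labels else "unknown"


def classify(llm: LLM, text: str, guidance: str = "") -> Tuple[str, str]:
    """Classify at the top level first, then among that label's children."""
    top = pick_label(llm, text, list(TAXONOMY), guidance)
    if top == "unknown":
        return top, "unknown"
    return top, pick_label(llm, text, TAXONOMY[top], guidance)


def accuracy(llm: LLM, labeled: List[Tuple[str, Tuple[str, str]]],
             guidance: str = "") -> float:
    """Fraction of labeled examples whose (top, sub) pair is predicted exactly."""
    hits = sum(classify(llm, text, guidance) == gold for text, gold in labeled)
    return hits / len(labeled)


# Keywords used only by the stand-in model below (assumption, for the demo).
KEYWORDS: Dict[str, List[str]] = {
    "billing": ["charge", "invoice", "refund"],
    "technical": ["log in", "password", "crash"],
    "refund": ["refund"],
    "invoice": ["invoice"],
    "login": ["log in", "password"],
    "crash": ["crash"],
}


def keyword_stub_llm(prompt: str) -> str:
    """Stand-in for a real model: returns the first offered label whose keyword matches."""
    lines = prompt.splitlines()
    options_line = next(l for l in lines if l.startswith("Classify the text"))
    text_line = next(l for l in lines if l.startswith("Text: "))
    options = options_line.split(": ", 1)[1].rstrip(".").split(", ")
    text = text_line[len("Text: "):].lower()
    for label in options:
        if any(keyword in text for keyword in KEYWORDS[label]):
            return label
    return "unsure"


if __name__ == "__main__":
    labeled = [
        ("I want a refund for the double charge", ("billing", "refund")),
        ("The app crashes when I open it", ("technical", "crash")),
    ]
    print(classify(keyword_stub_llm, "Hello there"))
    print(accuracy(keyword_stub_llm, labeled))
```

In a loop of this kind, a reviewer would inspect the `unknown` predictions and the misclassified labeled examples, then either revise `guidance` or add labels to the taxonomy and re-run `accuracy`. Swapping `keyword_stub_llm` for a function that calls an actual model leaves the rest of the sketch unchanged.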
