LAUD: Integrating Large Language Models with Active Learning for Unlabeled Data

Tzu-Hsuan Chou; Chun-Nan Chou

arXiv:2511.14738·cs.LG·November 19, 2025

TL;DR

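This paper introduces LAUD, a framework that combines large language models with active learning so that models can be trained when only unlabeled data is available. LAUD builds its initial label set with zero-shot learning, which reduces reliance on prompt-based methods, and the resulting models outperform zero-shot and few-shot learning on commodity name classification.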

Contribution

LAUD is a framework that integrates LLMs with active learning for unlabeled datasets. It addresses the cold-start problem by constructing the initial label set with zero-shot learning.

Findings

01 LLMs derived from LAUD outperform LLMs that use zero-shot or few-shot learning.

02 LAUD constructs its initial label set using zero-shot learning.

03 The results are reported on commodity name classification tasks.

Abstract

Large language models (LLMs) have shown a remarkable ability to generalize beyond their pre-training data, and fine-tuning LLMs can elevate performance to human-level and beyond. However, in real-world scenarios, a lack of labeled data often prevents practitioners from obtaining well-performing models, forcing them to rely heavily on prompt-based approaches that are often tedious, inefficient, and driven by trial and error. To alleviate this lack of labeled data, we present a learning framework integrating LLMs with active learning for unlabeled datasets (LAUD). LAUD mitigates the cold-start problem by constructing an initial label set with zero-shot learning. Experimental results show that LLMs derived from LAUD outperform LLMs with zero-shot or few-shot learning on commodity name classification tasks, demonstrating the effectiveness of LAUD.
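
The abstract does not specify the query strategy, the labeling oracle, or the training procedure, so the following is only a minimal sketch of the general pattern it describes: seed a labeled set with zero-shot predictions, then grow it with active learning. Everything named here is a hypothetical stand-in and not the authors' implementation: `zero_shot_label` uses keyword rules in place of an LLM prompt, `NaiveBayes` stands in for the fine-tuned model, uncertainty sampling is an assumed query strategy, and `oracle` is an assumed source of labels for queried items.

```python
import math
from collections import Counter, defaultdict


def zero_shot_label(text):
    """Hypothetical stand-in for a zero-shot LLM prompt (keyword rules)."""
    rules = {"fruit": ("apple", "banana"), "tool": ("hammer", "wrench")}
    for label, keywords in rules.items():
        if any(k in text.lower() for k in keywords):
            return label
    return None  # abstain when no rule fires


class NaiveBayes:
    """Tiny multinomial naive Bayes standing in for the fine-tuned model."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)
        self.label_counts = Counter(labels)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict_proba(self, text):
        total_docs = sum(self.label_counts.values())
        scores = {}
        for label, n_docs in self.label_counts.items():
            counts = self.word_counts[label]
            # Laplace smoothing, with one extra slot for unseen words.
            denom = sum(counts.values()) + len(self.vocab) + 1
            log_p = math.log(n_docs / total_docs)
            for word in text.lower().split():
                log_p += math.log((counts[word] + 1) / denom)
            scores[label] = log_p
        top = max(scores.values())
        exp = {label: math.exp(s - top) for label, s in scores.items()}
        z = sum(exp.values())
        return {label: v / z for label, v in exp.items()}


def active_learning(pool, oracle, rounds=2, batch_size=1):
    # Cold start: seed the labeled set with zero-shot predictions.
    labeled = {t: zero_shot_label(t) for t in pool}
    labeled = {t: y for t, y in labeled.items() if y is not None}
    unlabeled = [t for t in pool if t not in labeled]
    model = NaiveBayes().fit(list(labeled), list(labeled.values()))
    for _ in range(rounds):
        if not unlabeled:
            break
        # Assumed query strategy: least-confident items are labeled first.
        unlabeled.sort(key=lambda t: max(model.predict_proba(t).values()))
        queries, unlabeled = unlabeled[:batch_size], unlabeled[batch_size:]
        for text in queries:
            labeled[text] = oracle(text)
        # Retrain on the enlarged labeled set.
        model = NaiveBayes().fit(list(labeled), list(labeled.values()))
    return model, labeled


# Toy pool of commodity-like names; the last two get no zero-shot label.
pool = ["red apple", "ripe banana", "steel hammer", "pipe wrench",
        "ripe mango", "steel pliers"]
truth = {"ripe mango": "fruit", "steel pliers": "tool"}
model, labeled = active_learning(pool, truth.__getitem__)
print(sorted(labeled.items()))
```

In this toy run the zero-shot stand-in labels four of the six names, and the two it abstains on are the ones sent to the oracle. A real setup would replace the keyword rules with an actual LLM call and the naive Bayes model with fine-tuning, neither of which is shown here.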

Taxonomy

Topics: Topic Modeling · Machine Learning and Algorithms · Text Readability and Simplification