LAUD: Integrating Large Language Models with Active Learning for Unlabeled Data
Tzu-Hsuan Chou, Chun-Nan Chou

TL;DR
This paper introduces LAUD, a framework that combines large language models with active learning so that a model can be trained starting from an unlabeled dataset. It reduces reliance on prompt-based methods and improves classification accuracy over zero-shot and few-shot learning.
Contribution
LAUD is a framework that integrates LLMs with active learning for unlabeled datasets. It mitigates the cold-start problem by constructing the initial label set with zero-shot learning.
Findings
LLMs derived from LAUD outperform LLMs that use zero-shot or few-shot learning.
LAUD constructs its initial label set with zero-shot learning.
The results are reported on commodity name classification tasks.
Abstract
Large language models (LLMs) have shown a remarkable ability to generalize beyond their pre-training data, and fine-tuning LLMs can elevate performance to human-level and beyond. However, in real-world scenarios, a lack of labeled data often prevents practitioners from obtaining well-performing models, forcing them to rely heavily on prompt-based approaches that are often tedious, inefficient, and driven by trial and error. To alleviate this lack of labeled data, we present a learning framework integrating LLMs with active learning for unlabeled datasets (LAUD). LAUD mitigates the cold-start problem by constructing an initial label set with zero-shot learning. Experimental results show that LLMs derived from LAUD outperform LLMs with zero-shot or few-shot learning on commodity name classification tasks, demonstrating the effectiveness of LAUD.
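The abstract describes the overall shape of the approach (a zero-shot pass to build the initial label set, followed by active learning) but not its details. The following is a minimal sketch of that shape only, not the authors' implementation. Everything named here is a hypothetical stand-in: `zero_shot_label` replaces an LLM zero-shot call with a keyword match, `TinyNaiveBayes` replaces the fine-tuned LLM, `oracle` replaces a human annotator, and least-confidence sampling is an assumed query strategy that the abstract does not specify.

```python
import math
from collections import Counter, defaultdict


def zero_shot_label(text, keywords):
    """Hypothetical stand-in for an LLM zero-shot call: keyword match, or abstain."""
    tokens = text.lower().split()
    for label, words in keywords.items():
        if any(w in tokens for w in words):
            return label
    return None  # abstain when no keyword matches


class TinyNaiveBayes:
    """Hypothetical stand-in for the model that would be fine-tuned."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict_proba(self, text):
        total = sum(self.class_counts.values())
        scores = {}
        for label, count in self.class_counts.items():
            n = sum(self.word_counts[label].values())
            logp = math.log(count / total)
            for w in text.lower().split():
                # Laplace smoothing, with one extra slot for unseen words.
                logp += math.log((self.word_counts[label][w] + 1) / (n + len(self.vocab) + 1))
            scores[label] = logp
        top = max(scores.values())
        exp = {label: math.exp(s - top) for label, s in scores.items()}
        z = sum(exp.values())
        return {label: v / z for label, v in exp.items()}

    def predict(self, text):
        probs = self.predict_proba(text)
        return max(probs, key=probs.get)


def active_learning(pool, keywords, oracle, rounds=1, batch=2):
    # Cold start: build the initial label set from zero-shot predictions.
    # This sketch assumes at least one item receives a zero-shot label.
    labeled = {}
    for text in pool:
        label = zero_shot_label(text, keywords)
        if label is not None:
            labeled[text] = label
    model = TinyNaiveBayes().fit(list(labeled), list(labeled.values()))

    for _ in range(rounds):
        unlabeled = [t for t in pool if t not in labeled]
        if not unlabeled:
            break
        # Assumed query strategy: least-confident items are sent to the oracle first.
        unlabeled.sort(key=lambda t: max(model.predict_proba(t).values()))
        for text in unlabeled[:batch]:
            labeled[text] = oracle(text)
        # Retrain on the enlarged label set.
        model = TinyNaiveBayes().fit(list(labeled), list(labeled.values()))
    return model, labeled


# Toy commodity names, invented for illustration only.
truth = {
    "fresh red apple": "fruit",
    "ripe banana bunch": "fruit",
    "steel claw hammer": "tool",
    "cordless power drill": "tool",
    "fresh ripe mango": "fruit",
    "steel power saw": "tool",
}
keywords = {"fruit": ["apple", "banana"], "tool": ["hammer", "drill"]}
pool = list(truth)
model, labeled = active_learning(pool, keywords, oracle=truth.get)
```

In this toy run the zero-shot stand-in labels four of the six items, and the remaining two are queried from the oracle in a single round. In a real setting the stand-ins would be replaced by actual LLM calls, a fine-tuning step, and human annotation.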
Taxonomy
Topics: Topic Modeling · Machine Learning and Algorithms · Text Readability and Simplification
