ActiveLLM: Large Language Model-based Active Learning for Textual Few-Shot Scenarios
Markus Bayer, Justin Lutz, Christian Reuter

TL;DR
ActiveLLM introduces a novel active learning method leveraging large language models to improve instance selection in few-shot text classification, addressing cold-start issues and outperforming traditional strategies.
Contribution
The paper presents ActiveLLM, an approach that uses LLMs to select instances for active learning. It improves performance in few-shot scenarios and helps other active learning strategies overcome their cold-start problems.
Findings
ActiveLLM outperforms traditional active learning methods in few-shot classification.
ActiveLLM improves the performance of BERT classifiers in limited data settings.
The approach can be extended to non-few-shot scenarios for iterative learning.
Abstract
Active learning is designed to minimize annotation efforts by prioritizing instances that most enhance learning. However, many active learning strategies struggle with a 'cold-start' problem, needing substantial initial data to be effective. This limitation reduces their utility in the increasingly relevant few-shot scenarios, where the instance selection has a substantial impact. To address this, we introduce ActiveLLM, a novel active learning approach that leverages Large Language Models such as GPT-4, o1, Llama 3, or Mistral Large for selecting instances. We demonstrate that ActiveLLM significantly enhances the classification performance of BERT classifiers in few-shot scenarios, outperforming traditional active learning methods as well as improving the few-shot learning methods ADAPET, PERFECT, and SetFit. Additionally, ActiveLLM can be extended to non-few-shot scenarios, allowing…
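To make the idea of LLM-based instance selection concrete, here is a minimal sketch in Python. It is not the authors' prompt or implementation: the prompt wording, the function names (`build_selection_prompt`, `parse_indices`, `select_instances`), and the `llm` callable are all assumptions made for illustration. The `llm` argument stands in for whatever model client is used (GPT-4, Llama 3, and so on) and is expected to map a prompt string to a reply string.

```python
import re
from typing import Callable, List


def build_selection_prompt(texts: List[str], k: int) -> str:
    """Format a pool of unlabeled texts into one selection prompt.

    The wording is a hypothetical example, not the prompt from the paper.
    """
    numbered = "\n".join(f"[{i}] {t}" for i, t in enumerate(texts))
    return (
        f"Below are {len(texts)} unlabeled texts for a classification task.\n"
        f"Choose the {k} texts that would be most useful to label first.\n"
        "Answer with the chosen indices only, separated by commas.\n\n"
        f"{numbered}"
    )


def parse_indices(reply: str, pool_size: int, k: int) -> List[int]:
    """Extract up to k distinct, in-range indices from a free-text reply."""
    chosen: List[int] = []
    for match in re.findall(r"\d+", reply):
        idx = int(match)
        # Skip indices outside the pool and repeated picks.
        if idx < pool_size and idx not in chosen:
            chosen.append(idx)
    return chosen[:k]


def select_instances(
    texts: List[str], k: int, llm: Callable[[str], str]
) -> List[int]:
    """Ask the (assumed) LLM callable which k instances to annotate."""
    prompt = build_selection_prompt(texts, k)
    return parse_indices(llm(prompt), len(texts), k)
```

In this sketch the selected indices would go to a human annotator, and the resulting labeled examples would then be used to train the downstream classifier (a BERT model in the paper's experiments). Because the selection needs no labeled data and no trained model, it can run before any annotation exists, which is the cold-start setting the abstract describes.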
Taxonomy
Methods: Attention Is All You Need · Dense Connections · Attention Dropout · Linear Layer · Position-Wise Feed-Forward Layer · Weight Decay · Label Smoothing · Residual Connection · Absolute Position Encodings
