Impact of Stop Sets on Stopping Active Learning for Text Classification
Luke Kurlandski, Michael Bloodgood

TL;DR
This paper examines how the choice of stop sets influences the effectiveness of stopping methods in active learning for text classification, revealing significant performance differences and emphasizing the importance of unbiased, representative stop sets.
Contribution
It provides the first comprehensive analysis of how the choice of stop set affects stopping method performance, highlighting the advantage of unbiased, representative stop sets and their impact on stability-based methods.
Findings
Stopping methods perform better with unbiased stop sets than with biased ones.
Stability-based stopping methods perform better with unbiased stop sets.
Stop set choice significantly affects the practical effectiveness of active learning.
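To make the findings concrete, the following is a minimal sketch of how a stability-based stopping check on a stop set might look. It is not the authors' implementation: the function names (`draw_stop_set`, `cohen_kappa`, `should_stop`), the use of Cohen's kappa as the agreement measure, and the default `threshold` and `window` values are all assumptions chosen for illustration. The stop set is drawn by simple random sampling from the unlabeled pool, which is one straightforward way to obtain an unbiased, representative sample.

```python
import random
from collections import Counter
from typing import List, Sequence


def draw_stop_set(pool_size: int, stop_size: int, seed: int = 0) -> List[int]:
    """Pick stop-set indices uniformly at random from the unlabeled pool.

    Uniform sampling is an assumption here: it is one simple way to get an
    unbiased, representative stop set.
    """
    rng = random.Random(seed)
    return sorted(rng.sample(range(pool_size), stop_size))


def cohen_kappa(a: Sequence[int], b: Sequence[int]) -> float:
    """Cohen's kappa agreement between two equal-length label sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("label sequences must be non-empty and equal length")
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    # Agreement expected by chance, from each model's label distribution.
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        # Both models assign the same single label everywhere.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def should_stop(history: Sequence[Sequence[int]],
                threshold: float = 0.99,
                window: int = 3) -> bool:
    """Return True once stop-set predictions have stabilized.

    `history` holds one list of stop-set predictions per active learning
    iteration. Stopping requires `window` consecutive pairs of successive
    models to agree with kappa >= `threshold` (both values are illustrative).
    """
    if len(history) < window + 1:
        return False
    recent = history[-(window + 1):]
    return all(cohen_kappa(p, q) >= threshold
               for p, q in zip(recent, recent[1:]))
```

In an active learning loop, one would append the current model's predictions on the stop set to `history` after each round of labeling and call `should_stop(history)` before requesting more labels. Because the check only looks at agreement on the stop set, a stop set that does not represent the pool can make predictions look stable too early or too late, which is the sensitivity the findings describe.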
Abstract
Active learning is an increasingly important branch of machine learning and a powerful technique for natural language processing. The main advantage of active learning is its potential to reduce the amount of labeled data needed to learn high-performing models. A vital aspect of an effective active learning algorithm is the determination of when to stop obtaining additional labeled data. Several leading state-of-the-art stopping methods use a stop set to help make this decision. However, there has been relatively less attention given to the choice of stop set than to the stopping algorithms that are applied on the stop set. Different choices of stop sets can lead to significant differences in stopping method performance. We investigate the impact of different stop set choices on different stopping methods. This paper shows the choice of the stop set can have a significant impact on the…
Taxonomy
Topics: Machine Learning and Algorithms · Advanced Bandit Algorithms Research · Data Stream Mining Techniques
