Impact of Batch Size on Stopping Active Learning for Text Classification
Garrett Beatty, Ethan Kochis, Michael Bloodgood

TL;DR
This paper investigates how batch size affects the performance of stopping methods in active learning for text classification. It finds that larger batch sizes impair stopping performance, and that this impairment can be mitigated by adjusting the window size parameter.
Contribution
It demonstrates that larger batch sizes negatively impact stopping methods in active learning and proposes a mitigation strategy: tuning the window size parameter, which controls how many past learning iterations the stopping decision takes into account.
Findings
Large batch sizes degrade stopping method performance.
Adjusting window size mitigates degradation effects.
Smaller window sizes improve stopping accuracy with large batches.
Abstract
When using active learning, smaller batch sizes are typically more efficient from a learning efficiency perspective. However, in practice, due to speed and human annotator considerations, the use of larger batch sizes is necessary. While past work has shown that larger batch sizes decrease learning efficiency from a learning curve perspective, it remains an open question how batch size impacts methods for stopping active learning. We find that large batch sizes degrade the performance of a leading stopping method over and above the degradation that results from reduced learning efficiency. We analyze this degradation and find that it can be mitigated by changing the window size parameter, which sets how many past iterations of learning are taken into account when making the stopping decision. We find that when using larger batch sizes, stopping methods are more effective when smaller window sizes…
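The abstract does not spell out the stopping rule itself, so the following is only a minimal sketch of how a window size parameter can interact with batch size. It assumes a generic rule: after each active learning iteration some agreement score between successive models is recorded, and learning stops once the last `window_size` scores all reach a threshold. The function names, the threshold of 0.99, the batch size of 500, and the score values are all hypothetical illustrations, not the paper's method or data.

```python
from typing import Optional, Sequence


def should_stop(agreements: Sequence[float], window_size: int,
                threshold: float = 0.99) -> bool:
    """Return True if the last `window_size` scores all meet `threshold`.

    `agreements` holds one score per completed iteration (assumed to be
    some measure of agreement between successive models).
    """
    if window_size < 1:
        raise ValueError("window_size must be at least 1")
    if len(agreements) < window_size:
        return False  # not enough history to fill the window yet
    return all(score >= threshold for score in agreements[-window_size:])


def stopping_iteration(agreements: Sequence[float], window_size: int,
                       threshold: float = 0.99) -> Optional[int]:
    """Return the 1-based iteration at which the rule first fires, or None."""
    for i in range(1, len(agreements) + 1):
        if should_stop(agreements[:i], window_size, threshold):
            return i
    return None


# Made-up scores for illustration only; not results from the paper.
agreements = [0.90, 0.95, 0.992, 0.995, 0.996, 0.997]
batch_size = 500  # hypothetical number of labels requested per iteration

for window_size in (1, 3):
    stop_at = stopping_iteration(agreements, window_size)
    labels_used = stop_at * batch_size if stop_at is not None else None
    print(f"window_size={window_size}: stop at iteration {stop_at}, "
          f"labels used = {labels_used}")
```

Under these assumptions, every extra iteration the window waits for costs a full batch of annotations, so the price of a large window grows with batch size. That is one intuitive reading of why smaller windows might suit larger batches.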
