Diversity Enhanced Active Learning with Strictly Proper Scoring Rules
Wei Tan, Lan Du, Wray Buntine

TL;DR
This paper introduces an active learning acquisition function built on strictly proper scoring rules and reports, through convergence analysis and extensive experiments, improved robustness and performance on text classification tasks.
Contribution
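The paper proposes BEMPS (Bayesian Estimate of Mean Proper Scores), an acquisition function that estimates the expected increase in a strictly proper score such as log probability or negative mean square error. It pairs BEMPS with a batch selection method that encourages diversity in the vector of expected score changes, improving active learning for text classification.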
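As a rough illustration of what diversity-promoting batch selection can look like, the sketch below picks a batch by greedy farthest-point selection over per-candidate vectors of expected score changes. This is a swapped-in, generic technique chosen for brevity, not the batch algorithm from the paper; the function name `select_diverse_batch` and the input layout (one row per unlabelled candidate) are assumptions made for this example.

```python
import numpy as np

def select_diverse_batch(score_changes, batch_size):
    """Greedy farthest-point selection (illustrative, not the paper's method).

    score_changes: array of shape (n_candidates, d). Row i is a hypothetical
    vector of expected score changes for unlabelled candidate i.
    Returns a list of selected candidate indices.
    """
    score_changes = np.asarray(score_changes, dtype=float)
    n = score_changes.shape[0]
    batch_size = min(batch_size, n)

    # Seed with the candidate whose vector has the largest norm,
    # i.e. the largest overall expected change.
    chosen = [int(np.argmax(np.linalg.norm(score_changes, axis=1)))]

    # Distance from each candidate to its nearest already-chosen candidate.
    min_dist = np.linalg.norm(score_changes - score_changes[chosen[0]], axis=1)
    min_dist[chosen[0]] = -np.inf  # never pick the same candidate twice

    while len(chosen) < batch_size:
        nxt = int(np.argmax(min_dist))  # farthest from the current batch
        chosen.append(nxt)
        dist = np.linalg.norm(score_changes - score_changes[nxt], axis=1)
        min_dist = np.minimum(min_dist, dist)
        min_dist[nxt] = -np.inf
    return chosen
```

With two near-duplicate pairs and one isolated vector, the selection takes one candidate from each region rather than both members of a pair, which is the behaviour a diversity term is meant to encourage.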
Findings
BEMPS outperforms the other acquisition functions compared in the paper's text classification experiments.
Basing acquisition on strictly proper scoring rules leads to more robust active learning.
Encouraging diversity within each selected batch improves learning efficiency.
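To make the term "strictly proper" concrete: a scoring rule is strictly proper when the expected score, taken under the true label distribution, is uniquely maximised by forecasting that distribution. The minimal sketch below checks this numerically for the log score and the negative squared error (Brier-style) score. The three-class distribution and the alternative forecasts are made-up values for illustration only.

```python
import numpy as np

def log_score(q, y):
    """Log probability assigned by forecast q to the observed label y."""
    return float(np.log(q[y]))

def neg_squared_error(q, y):
    """Negative squared error between forecast q and the one-hot label y."""
    onehot = np.zeros_like(q)
    onehot[y] = 1.0
    return float(-np.sum((q - onehot) ** 2))

def expected_score(score_fn, p, q):
    """Expected score of forecast q when labels are drawn from p."""
    return sum(p[y] * score_fn(q, y) for y in range(len(p)))

# Hypothetical true distribution and two mis-calibrated forecasts.
p = np.array([0.7, 0.2, 0.1])
overconfident = np.array([0.9, 0.05, 0.05])
uniform = np.array([1 / 3, 1 / 3, 1 / 3])

for name, fn in [("log", log_score), ("neg. squared error", neg_squared_error)]:
    honest = expected_score(fn, p, p)
    print(name, "honest:", honest,
          "overconfident:", expected_score(fn, p, overconfident),
          "uniform:", expected_score(fn, p, uniform))
```

For both scores the honest forecast `p` attains the highest expected value, so neither overstating nor understating confidence is rewarded.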
Abstract
We study acquisition functions for active learning (AL) for text classification. The Expected Loss Reduction (ELR) method focuses on a Bayesian estimate of the reduction in classification error, recently updated with Mean Objective Cost of Uncertainty (MOCU). We convert the ELR framework to estimate the increase in (strictly proper) scores like log probability or negative mean square error, which we call Bayesian Estimate of Mean Proper Scores (BEMPS). We also prove convergence results borrowing techniques used with MOCU. In order to allow better experimentation with the new acquisition functions, we develop a complementary batch AL algorithm, which encourages diversity in the vector of expected changes in scores for unlabelled data. To allow high performance text classifiers, we combine ensembling and dynamic validation set construction on pretrained language models. Extensive…
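The idea of estimating an expected increase in a proper score can be sketched with a small ensemble. The code below is a simplified illustration under several assumptions and is not the paper's exact estimator: ensemble members stand in for posterior samples, a hypothetical label for the candidate is absorbed by reweighting members by their likelihood (rather than retraining), and the score is the log score, for which the expected gain at a pool point reduces to a KL divergence between the updated and current predictive distributions. The function name and array layouts are invented for this example.

```python
import numpy as np

def expected_log_score_gain(cand_probs, pool_probs, eps=1e-12):
    """Illustrative expected log-score gain from labelling one candidate.

    cand_probs: (M, C) class probabilities of M ensemble members on the candidate.
    pool_probs: (M, N, C) class probabilities of the members on N pool points.
    """
    cand_probs = np.asarray(cand_probs, dtype=float)
    pool_probs = np.asarray(pool_probs, dtype=float)
    n_members, n_classes = cand_probs.shape

    prior_w = np.full(n_members, 1.0 / n_members)           # uniform member weights
    current = np.einsum("m,mnc->nc", prior_w, pool_probs)   # current pool predictive
    label_probs = prior_w @ cand_probs                      # predictive for the candidate

    gain = 0.0
    for y in range(n_classes):
        if label_probs[y] <= 0.0:
            continue
        # Bayes-style reweighting of members after observing label y.
        post_w = prior_w * cand_probs[:, y] / label_probs[y]
        updated = np.einsum("m,mnc->nc", post_w, pool_probs)
        # Expected log score of `updated` minus that of `current`,
        # both taken under `updated`: this is KL(updated || current).
        kl = np.sum(updated * (np.log(updated + eps) - np.log(current + eps)), axis=1)
        gain += label_probs[y] * kl.mean()
    return float(gain)
```

A candidate on which all members agree leaves the weights unchanged and yields zero gain, while a candidate on which members disagree, in a way that correlates with their disagreement on the pool, yields a positive gain.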
Taxonomy
Topics: Machine Learning and Algorithms · Natural Language Processing Techniques · Topic Modeling
Methods: Expected Loss Reduction (ELR)
