Analysis of Stopping Active Learning based on Stabilizing Predictions

Michael Bloodgood; John Grothendieck

arXiv:1504.06329·cs.LG·April 27, 2015·19 cites

Open Access

TL;DR

This paper gives a theoretical analysis of the stabilizing predictions (SP) criterion for stopping active learning in NLP. It shows that Cohen's Kappa agreement between successively trained models bounds the difference in their F-measure, and that the unlabeled stop set can be made large so that the results transfer to unseen data.

Contribution

The paper presents the first theoretical analysis of stopping active learning based on stabilizing predictions (SP), showing that bounds on Cohen's Kappa agreement between successively trained models impose bounds on the difference in their F-measure performance.

Findings

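01. Bounds on Cohen's Kappa agreement between successively trained models impose bounds on the difference in their F-measure.

02. Because the stop set does not have to be labeled, it can be made large, which helps the results transfer to previously unseen examples.

03. Good (low-variance) sample estimates of Kappa between successive models can be obtained, and the stopping decision relies on them.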

Abstract

Within the natural language processing (NLP) community, active learning has been widely investigated and applied in order to alleviate the annotation bottleneck faced by developers of new NLP systems and technologies. This paper presents the first theoretical analysis of stopping active learning based on stabilizing predictions (SP). The analysis has revealed three elements that are central to the success of the SP method: (1) bounds on Cohen's Kappa agreement between successively trained models impose bounds on differences in F-measure performance of the models; (2) since the stop set does not have to be labeled, it can be made large in practice, helping to guarantee that the results transfer to previously unseen streams of examples at test/application time; and (3) good (low variance) sample estimates of Kappa between successive models can be obtained. Proofs of relationships between…
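The agreement check described in the abstract can be illustrated with a minimal sketch. It computes Cohen's Kappa between the predictions of two successively trained models on an unlabeled stop set and applies a simple stopping rule. The function names `cohen_kappa` and `should_stop`, the `threshold` of 0.99, the `window` of 3, and the rule of averaging recent Kappa values are illustrative assumptions, not details taken from the text above.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Cohen's Kappa between two equal-length sequences of predicted labels.

    kappa = (A_o - A_e) / (1 - A_e), where A_o is the observed agreement and
    A_e is the agreement expected by chance from each model's label frequencies.
    """
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        # Both models assign the same single label to every example.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def should_stop(kappas: Sequence[float],
                threshold: float = 0.99,  # assumed value, for illustration only
                window: int = 3) -> bool:  # assumed value, for illustration only
    """Hypothetical rule: stop once the mean of the last `window` Kappa
    values between successive models reaches `threshold`."""
    if len(kappas) < window:
        return False
    recent = kappas[-window:]
    return sum(recent) / window >= threshold


# Predictions of two successive models on a tiny, unlabeled stop set.
previous_model = [1, 1, 0, 0]
current_model = [1, 1, 0, 1]
kappa = cohen_kappa(previous_model, current_model)
```

No gold labels appear anywhere in the sketch: only the two models' predictions on the stop set are compared, which is why the stop set can be made large at no annotation cost.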

Taxonomy

Topics: Machine Learning and Algorithms · semigroups and automata theory · Algorithms and Data Compression