Active Learning for Open-set Annotation
Kun-Peng Ning, Xun Zhao, Yu Li, Sheng-Jun Huang

TL;DR
This paper introduces LfOSA, an active learning framework for open-set annotation. It detects known-class examples in unlabeled data that also contains examples from unknown classes, which improves classification accuracy and reduces annotation cost.
Contribution
The paper presents the first active learning method designed specifically for open-set annotation. It fits a Gaussian Mixture Model to the per-example max activation value (MAV) distribution produced by an auxiliary network, and uses temperature scaling to improve known-class detection.
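The effect of temperature scaling mentioned above can be illustrated with a generic temperature-scaled softmax. This is only a sketch of the general technique: the function name `softmax_with_temperature` and the example logits are hypothetical, and how LfOSA applies the temperature inside its loss is not specified in this summary.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Generic softmax over logits divided by a temperature (illustrative only)."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for a three-class classifier.
logits = [2.0, 1.0, 0.5]
p_standard = softmax_with_temperature(logits, temperature=1.0)
p_sharp = softmax_with_temperature(logits, temperature=0.5)
```

With a temperature below 1 the distribution concentrates more mass on the largest logit, so `max(p_sharp)` exceeds `max(p_standard)`.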
Findings
Significantly improves the selection quality of known-class examples.
Achieves higher classification accuracy with lower annotation costs.
Outperforms state-of-the-art active learning methods in open-set scenarios.
Abstract
Existing active learning studies typically work in the closed-set setting by assuming that all data examples to be labeled are drawn from known classes. However, in real annotation tasks, the unlabeled data usually contains a large amount of examples from unknown classes, resulting in the failure of most active learning methods. To tackle this open-set annotation (OSA) problem, we propose a new active learning framework called LfOSA, which boosts the classification performance with an effective sampling strategy to precisely detect examples from known classes for annotation. The LfOSA framework introduces an auxiliary network to model the per-example max activation value (MAV) distribution with a Gaussian Mixture Model, which can dynamically select the examples with highest probability from known classes in the unlabeled set. Moreover, by reducing the temperature of the loss…
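The sampling idea in the abstract (model the MAV distribution with a Gaussian Mixture Model, then pick the examples most likely to come from known classes) can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes a single pool of one-dimensional MAV scores, a two-component mixture fitted with a hand-written EM loop, and that the higher-mean component corresponds to known classes. All function names and the synthetic data are hypothetical.

```python
import math
import random

def gaussian_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def fit_two_component_gmm(values, n_iter=100):
    """Fit a 1-D two-component GMM with EM; returns (weights, means, variances)."""
    overall_mean = sum(values) / len(values)
    var0 = sum((v - overall_mean) ** 2 for v in values) / len(values) + 1e-6
    weights = [0.5, 0.5]
    means = [min(values), max(values)]  # start the components at the extremes
    variances = [var0, var0]
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each value.
        resp = []
        for x in values:
            p = [weights[k] * gaussian_pdf(x, means[k], variances[k]) for k in range(2)]
            total = sum(p) + 1e-300
            resp.append([pk / total for pk in p])
        # M-step: re-estimate weights, means, and variances.
        for k in range(2):
            nk = sum(r[k] for r in resp) + 1e-12
            weights[k] = nk / len(values)
            means[k] = sum(r[k] * x for r, x in zip(resp, values)) / nk
            variances[k] = sum(r[k] * (x - means[k]) ** 2
                               for r, x in zip(resp, values)) / nk + 1e-6
    return weights, means, variances

def known_class_probability(values, weights, means, variances):
    """Posterior of the higher-mean component (assumed here to mean 'known class')."""
    hi = 0 if means[0] > means[1] else 1
    probs = []
    for x in values:
        p = [weights[k] * gaussian_pdf(x, means[k], variances[k]) for k in range(2)]
        probs.append(p[hi] / (sum(p) + 1e-300))
    return probs

def select_for_annotation(values, budget):
    """Return indices of the `budget` examples with the highest known-class posterior."""
    weights, means, variances = fit_two_component_gmm(values)
    probs = known_class_probability(values, weights, means, variances)
    ranked = sorted(range(len(values)), key=lambda i: probs[i], reverse=True)
    return ranked[:budget]

# Synthetic MAV scores: unknown-class examples score low, known-class examples high.
rng = random.Random(0)
unknown_mavs = [rng.gauss(1.0, 0.5) for _ in range(200)]
known_mavs = [rng.gauss(5.0, 0.5) for _ in range(100)]
mavs = unknown_mavs + known_mavs
selected = select_for_annotation(mavs, budget=20)
```

On this synthetic pool, indices 200 and above are the known-class examples, so a useful selection should consist of those indices. In practice the scores would come from a trained network rather than a random generator.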
Taxonomy
Topics: Machine Learning and Algorithms · Pneumonia and Respiratory Infections · Text and Document Classification Technologies
