Optimal Labeler Assignment and Sampling for Active Learning in the Presence of Imperfect Labels
Pouya Ahadi, Blair Winograd, Camille Zaug, Karunesh Arora, Lijun Wang, and Kamran Paynabar

TL;DR
This paper introduces an active learning framework that optimally assigns query points to labelers and selects which points to query so as to limit the impact of label noise, making the classifier more robust to imperfect labels.
Contribution
It proposes an assignment model and a sampling method to reduce label noise effects in active learning, enhancing classification accuracy under noisy labeling conditions.
Findings
Significant improvement in classification performance over benchmark methods.
Effective reduction of noise impact through optimal labeler assignment.
Enhanced robustness of classifiers trained with noisy labels.
Abstract
Active Learning (AL) has garnered significant interest across various application domains where labeling training data is costly. AL provides a framework that helps practitioners query informative samples for annotation by oracles (labelers). However, these labels often contain noise due to varying levels of labeler accuracy. Additionally, uncertain samples are more prone to receiving incorrect labels because of their complexity. Learning from imperfectly labeled data leads to an inaccurate classifier. We propose a novel AL framework to construct a robust classification model by minimizing noise levels. Our approach includes an assignment model that optimally assigns query points to labelers, aiming to minimize the maximum possible noise within each cycle. Additionally, we introduce a new sampling method to identify the best query points, reducing the impact of label noise on classifier…
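The assignment idea in the abstract, assigning query points to labelers so that the maximum possible noise in a cycle is minimized, can be illustrated with a small sketch. This is not the authors' formulation: the noise matrix, the per-labeler capacity limit, and the function name `minimax_assignment` are assumptions made for illustration, and the brute-force search only suits tiny instances.

```python
from itertools import product


def minimax_assignment(noise, capacity):
    """Assign each query point to one labeler, minimizing the worst noise.

    noise[i][j] is an assumed estimate of the probability that labeler j
    mislabels query point i. capacity is an assumed limit on how many
    points a single labeler may receive in one cycle.
    Returns (assignment, worst_noise); assignment[i] is the labeler index
    chosen for point i, or None if no feasible assignment exists.
    """
    n_points = len(noise)
    n_labelers = len(noise[0])
    best_assignment, best_worst = None, float("inf")
    # Enumerate every way to map points to labelers (exponential; demo only).
    for assignment in product(range(n_labelers), repeat=n_points):
        # Skip assignments that overload any labeler.
        if any(assignment.count(j) > capacity for j in range(n_labelers)):
            continue
        # The objective is the largest noise level among assigned pairs.
        worst = max(noise[i][j] for i, j in enumerate(assignment))
        if worst < best_worst:
            best_assignment, best_worst = assignment, worst
    return best_assignment, best_worst


# Hypothetical noise estimates: 4 query points, 2 labelers.
noise = [
    [0.10, 0.30],
    [0.40, 0.20],
    [0.15, 0.35],
    [0.45, 0.25],
]
assignment, worst = minimax_assignment(noise, capacity=2)
print(assignment, worst)
```

In this toy instance each labeler is better on a different pair of points, so the minimax objective sends each point to the labeler with the lower assumed noise on it. For realistic problem sizes, a thresholded bipartite matching or an integer program would replace the exhaustive search.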
Taxonomy
Topics: Machine Learning and Algorithms · Machine Learning and Data Classification · Text and Document Classification Technologies
