Sample Noise Impact on Active Learning
Alexandre Abraham, Léo Dreyfus-Schmidt

TL;DR
This paper investigates how sample noise affects active learning and introduces a robust sampling method that improves performance on synthetic data, highlighting the importance of noise awareness in active learning strategies.
Contribution
It proposes Incremental Weighted K-Means, a noise-aware sampling method that enhances active learning performance, especially on synthetic datasets.
Findings
Significant improvement on synthetic tasks
Marginal uplift on real-life tasks
Highlights importance of noise knowledge in active learning
Abstract
This work explores the effect of noisy sample selection in active learning strategies. We show on both synthetic problems and real-life use-cases that knowledge of the sample noise can significantly improve the performance of active learning strategies. Building on prior work, we propose a robust sampler, Incremental Weighted K-Means, that brings significant improvement on the synthetic tasks but only a marginal uplift on real-life ones. We hope that the questions raised in this paper are of interest to the community and could open new paths for active learning research.
Taxonomy
Topics: Machine Learning and Algorithms · Machine Learning and Data Classification · Algorithms and Data Compression
