Sample Noise Impact on Active Learning

Alexandre Abraham, Léo Dreyfus-Schmidt

arXiv:2109.01372·stat.ML·October 25, 2022·1 cites

TL;DR

This paper investigates how sample noise affects active learning and introduces a robust sampling method that improves performance on synthetic data, highlighting the importance of noise awareness in active learning strategies.

Contribution

The paper proposes Incremental Weighted K-Means, a noise-aware robust sampler that significantly improves active learning performance on synthetic tasks but brings only a marginal uplift on real-life ones.

Findings

01 Significant improvement on synthetic tasks

02 Marginal uplift on real-life tasks

03 Highlights importance of noise knowledge in active learning

Abstract

This work explores the effect of noisy sample selection in active learning strategies. We show on both synthetic problems and real-life use-cases that knowledge of the sample noise can significantly improve the performance of active learning strategies. Building on prior work, we propose a robust sampler, Incremental Weighted K-Means, that brings significant improvement on the synthetic tasks but only a marginal uplift on real-life ones. We hope that the questions raised in this paper are of interest to the community and could open new paths for active learning research.
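The abstract does not spell out how Incremental Weighted K-Means works, so the following is not the paper's algorithm. It is only a minimal sketch of the general idea that knowledge of sample noise can steer a clustering-based batch sampler. It assumes each unlabeled point comes with a noise estimate in [0, 1], turns that estimate into a weight of `1 - noise`, runs a weighted variant of Lloyd's k-means, and queries the point closest to each centroid. The function names `weighted_kmeans` and `select_batch`, the weighting rule, and the toy data are all hypothetical.

```python
import math

def weighted_kmeans(points, weights, k, n_iter=20):
    """Lloyd's algorithm where each point pulls its centroid in proportion to its weight."""
    # Deterministic farthest-first initialisation, seeded with the highest-weight point.
    first = max(range(len(points)), key=lambda i: weights[i])
    centroids = [points[first]]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(far)
    dim = len(points[0])
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points]
        # Update step: weighted mean of the members of each cluster.
        new_centroids = []
        for j in range(k):
            members = [i for i, lab in enumerate(labels) if lab == j]
            total = sum(weights[i] for i in members)
            if total == 0:
                new_centroids.append(centroids[j])  # keep an empty or zero-weight cluster in place
                continue
            new_centroids.append(tuple(
                sum(weights[i] * points[i][d] for i in members) / total
                for d in range(dim)
            ))
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids

def select_batch(points, noise, k):
    """Pick k distinct point indices to label, one per weighted cluster."""
    # Assumed weighting rule: low estimated noise means high weight.
    weights = [1.0 - n for n in noise]
    centroids = weighted_kmeans(points, weights, k)
    batch = []
    for c in centroids:
        candidates = [i for i in range(len(points)) if i not in batch]
        batch.append(min(candidates, key=lambda i: math.dist(points[i], c)))
    return batch

# Toy data (made up): two square clusters, each with one sample flagged as noisy.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
noise = [0.1, 0.1, 0.9, 0.1, 0.1, 0.9, 0.1, 0.1]
batch = select_batch(points, noise, k=2)
print(batch)
```

In this toy setting the noisy samples pull the centroids less than the clean ones, so the selected points lie on the side of each cluster away from the flagged samples. An incremental sampler would repeat such a selection across labeling rounds; that part is left out here.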

Taxonomy

Topics: Machine Learning and Algorithms · Machine Learning and Data Classification · Algorithms and Data Compression