TL;DR
This paper introduces a generic importance reweighting approach for weakly supervised learning that effectively handles various label noise types by leveraging a small trusted dataset alongside a larger noisy dataset.
Contribution
It proposes a novel, unified reweighting scheme for biquality data, enabling robust learning across different label noise scenarios.
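To make the idea of importance reweighting on biquality data concrete, here is a minimal sketch. It is not taken from the paper. It assumes two probabilistic classifiers have already been fitted, one on the small trusted dataset and one on the larger untrusted dataset, and that each untrusted example is weighted by the ratio of the probabilities the two models assign to its observed label. The function names, the ratio form, and the `eps` floor are illustrative assumptions, not the authors' implementation.

```python
import math

def ratio_weights(p_trusted, p_untrusted, eps=1e-12):
    """Hypothetical per-example weights for the untrusted dataset.

    p_trusted[i]   : probability the trusted-data model gives to the
                     observed (possibly noisy) label of untrusted example i
    p_untrusted[i] : probability the untrusted-data model gives to it
    Assumption: weight = p_trusted / p_untrusted, floored by eps to avoid
    division by zero.
    """
    return [pt / max(pu, eps) for pt, pu in zip(p_trusted, p_untrusted)]

def weighted_log_loss(probs, weights, eps=1e-12):
    """Weighted negative log-likelihood, normalized by the total weight.

    probs[i] is the final model's probability for the observed label of
    example i; weights[i] is that example's importance weight.
    """
    total = sum(weights)
    return -sum(w * math.log(max(p, eps)) for p, w in zip(probs, weights)) / total

# Example: the first label looks consistent with the trusted model,
# the second looks likely corrupted and is down-weighted.
weights = ratio_weights([0.9, 0.1], [0.9, 0.5])
loss = weighted_log_loss([0.5, 0.5], weights)
```

In this sketch, trusted examples would simply keep a weight of 1, and a final classifier would be trained on both datasets by minimizing the weighted loss.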
Findings
Outperforms existing methods in noisy-label settings
Effectively identifies non-corrupted examples in the noisy (untrusted) dataset
Demonstrates robustness across multiple noise types and levels of data quality
Abstract
The field of Weakly Supervised Learning (WSL) has recently seen a surge of popularity, with numerous papers addressing different types of "supervision deficiencies", namely: poor quality, non-adaptability, and insufficient quantity of labels. Regarding quality, label noise can be of different types, including completely-at-random, at-random, or even not-at-random. These kinds of label noise are addressed separately in the literature, leading to highly specialized approaches. This paper proposes an original, encompassing view of Weakly Supervised Learning, which results in the design of generic approaches capable of dealing with any kind of label noise. For this purpose, an alternative setting called "Biquality data" is used. It assumes that a small trusted dataset of correctly labeled examples is available, in addition to an untrusted dataset of noisy examples. In this paper, we…
