Sample Selection with Uncertainty of Losses for Learning with Noisy Labels

Xiaobo Xia; Tongliang Liu; Bo Han; Mingming Gong; Jun Yu; Gang Niu; Masashi Sugiyama

arXiv:2106.00445 · cs.LG · June 2, 2021 · 49 cites

TL;DR

This paper proposes a novel sample selection method for learning with noisy labels that uses interval estimation of losses to better distinguish between mislabeled data and underrepresented correctly labeled data, improving robustness.

Contribution

It introduces an uncertainty-aware approach that estimates each example's loss with a confidence interval rather than a single value, making it easier to tell mislabeled examples apart from correctly labeled but underrepresented ones.

Findings

01 Outperforms baseline methods across various noise types

02 More effectively distinguishes between mislabeled and underrepresented data

03 Demonstrates robustness to a broad range of label noise

Abstract

In learning with noisy labels, the sample selection approach is very popular; it regards small-loss data as correctly labeled during training. However, losses are generated on-the-fly based on the model being trained with noisy labels, and thus large-loss data are likely, but not certain, to be incorrect. There are actually two possibilities for a large-loss data point: (a) it is mislabeled, and then its loss decreases more slowly than that of other data, since deep neural networks "learn patterns first"; (b) it belongs to an underrepresented group of data and has not been selected yet. In this paper, we incorporate the uncertainty of losses by adopting interval estimation instead of point estimation of losses, where lower bounds of the confidence intervals of losses derived from distribution-free concentration inequalities, but not losses themselves, are used for sample selection. In this way, we…
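
The idea of selecting on a lower confidence bound rather than on the loss itself can be illustrated with a minimal sketch. This is not the paper's estimator: it assumes a plain Hoeffding bound, losses bounded in [0, loss_max], and treats each example's recorded losses as independent draws. The function names, the `history` dictionary, and the parameters `loss_max`, `keep_ratio`, and `delta` are all hypothetical.

```python
import math


def hoeffding_lower_bound(losses, loss_max, delta=0.05):
    """Lower confidence bound on the mean loss of one example.

    Assumption (not from the paper): the recorded losses are independent
    draws bounded in [0, loss_max]. Hoeffding's inequality then gives
    mean - loss_max * sqrt(ln(1/delta) / (2n)) as a bound that holds
    with probability at least 1 - delta.
    """
    n = len(losses)
    if n == 0:
        # Never observed: maximally uncertain, so always eligible.
        return float("-inf")
    mean = sum(losses) / n
    radius = loss_max * math.sqrt(math.log(1.0 / delta) / (2.0 * n))
    return mean - radius


def select_small_loss(history, loss_max, keep_ratio, delta=0.05):
    """Keep the examples with the smallest lower bounds (not smallest losses)."""
    bounds = {i: hoeffding_lower_bound(h, loss_max, delta)
              for i, h in history.items()}
    k = max(1, int(round(keep_ratio * len(bounds))))
    ranked = sorted(bounds, key=lambda i: bounds[i])
    return ranked[:k], bounds


# Hypothetical loss histories for three training examples.
history = {
    "a": [0.2, 0.2, 0.2, 0.2],  # small loss, seen several times
    "b": [2.0] * 8,             # large loss that stays large: case (a)
    "c": [2.0],                 # large loss but seen only once: case (b)
}
selected, bounds = select_small_loss(history, loss_max=4.0, keep_ratio=2 / 3)
print(selected)  # "b" is excluded; "c" is kept despite its large loss
```

Examples "b" and "c" have the same mean loss, so a point estimate cannot separate them. The interval is much wider for "c" because it has been observed only once, which pushes its lower bound down and keeps it eligible for selection, while the repeatedly observed "b" is filtered out.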

Videos

Sample Selection with Uncertainty of Losses for Learning with Noisy Labels · slideslive

Taxonomy

Topics: Machine Learning and Data Classification · Advanced Statistical Methods and Models · Advanced Statistical Process Monitoring