Pi-DUAL: Using Privileged Information to Distinguish Clean from Noisy Labels

Ke Wang; Guillermo Ortiz-Jimenez; Rodolphe Jenatton; Mark Collier; Efi Kokiopoulou; Pascal Frossard

arXiv:2310.06600 · cs.LG · May 29, 2024

TL;DR

Pi-DUAL is an architecture that uses privileged information, available only during training, to distinguish clean from noisy labels, improving both test accuracy and the detection of noisy samples.

Contribution

Introduces Pi-DUAL, a new method that uses privileged information to separate clean and noisy labels, outperforming existing approaches in accuracy and noise identification.

Findings

01  Achieves +6.8% accuracy on the ImageNet-PI benchmark.

02  Outperforms other methods at identifying noisy samples.

03  Establishes a new state of the art in label noise mitigation.

Abstract

Label noise is a pervasive problem in deep learning that often compromises the generalization performance of trained models. Recently, leveraging privileged information (PI) -- information available only during training but not at test time -- has emerged as an effective approach to mitigate this issue. Yet, existing PI-based methods have failed to consistently outperform their no-PI counterparts in terms of preventing overfitting to label noise. To address this deficiency, we introduce Pi-DUAL, an architecture designed to harness PI to distinguish clean from wrong labels. Pi-DUAL decomposes the output logits into a prediction term, based on conventional input features, and a noise-fitting term influenced solely by PI. A gating mechanism steered by PI adaptively shifts focus between these terms, allowing the model to implicitly separate the learning paths of clean and wrong labels.…
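
The decomposition described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the version below makes several assumptions: the gate is a single sigmoid scalar per example, the two terms are mixed as a convex combination, and each term is a plain linear map. The names `pi_dual_logits`, `w_pred`, `w_noise`, and `w_gate` are hypothetical and chosen only for illustration.

```python
import math


def linear(weights, features):
    """Apply a linear map; `weights` holds one row of coefficients per output."""
    return [sum(w * f for w, f in zip(row, features)) for row in weights]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def pi_dual_logits(x, pi, w_pred, w_noise, w_gate):
    """Sketch of a gated two-term logit decomposition (assumed form).

    x  : conventional input features (available at train and test time)
    pi : privileged information (available at train time only)
    """
    # Prediction term: depends only on the conventional features x.
    pred = linear(w_pred, x)
    # Noise-fitting term: depends only on the privileged information.
    noise = linear(w_noise, pi)
    # Gate steered by PI. Assumed convention: a value near 1 routes the
    # example to the noise-fitting term, a value near 0 to the prediction term.
    gate = sigmoid(sum(w * a for w, a in zip(w_gate, pi)))
    # Assumed mixing rule: convex combination of the two terms.
    train = [(1.0 - gate) * p + gate * n for p, n in zip(pred, noise)]
    return train, pred, gate


# Toy usage with made-up numbers (two classes, two features, one PI feature).
train_logits, pred_logits, gate = pi_dual_logits(
    x=[1.0, 2.0],
    pi=[0.0],
    w_pred=[[1.0, 0.0], [0.0, 1.0]],
    w_noise=[[5.0], [-5.0]],
    w_gate=[3.0],
)
# At test time PI is unavailable, so only the prediction term is used.
test_class = max(range(len(pred_logits)), key=lambda k: pred_logits[k])
```

Under these assumptions, the training loss would be computed on `train_logits`, while `pred_logits` alone serves inference. The gate value could also be read as a per-example indicator of whether a label was routed to the noise-fitting term, which is one plausible way to connect the sketch to the noisy-sample identification reported in the findings.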

Taxonomy

Topics: Machine Learning and Data Classification · Infrastructure Maintenance and Monitoring · Adversarial Robustness in Machine Learning

Methods: Focus