Learning from Noisy Labels via Conditional Distributionally Robust Optimization

Hui Guo; Grace Y. Yi; Boyu Wang

arXiv:2411.17113 · cs.LG · November 27, 2024

TL;DR

This paper introduces a novel robust learning framework using conditional distributionally robust optimization (CDRO) to handle noisy labels in crowdsourced datasets, improving model accuracy under high-noise conditions.

Contribution

It formulates a new CDRO-based approach with a robust pseudo-labeling algorithm and provides analytical solutions for efficient optimization, addressing label noise misspecification.

Findings

1. Outperforms existing methods on synthetic datasets.
2. Demonstrates robustness in real-world noisy label scenarios.
3. Provides theoretical guarantees for the proposed approach.

Abstract

While crowdsourcing has emerged as a practical solution for labeling large datasets, it presents a significant challenge in learning accurate models due to noisy labels from annotators with varying levels of expertise. Existing methods typically estimate the true label posterior, conditioned on the instance and noisy annotations, to infer true labels or adjust loss functions. These estimates, however, often overlook potential misspecification in the true label posterior, which can degrade model performance, especially in high-noise scenarios. To address this issue, we investigate learning from noisy annotations with an estimated true label posterior through the framework of conditional distributionally robust optimization (CDRO). We propose formulating the problem as minimizing the worst-case risk within a distance-based ambiguity set centered around a reference distribution. By…
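
The idea of a worst-case risk over an ambiguity set can be illustrated with a minimal sketch for a single instance. This is not the paper's algorithm: the excerpt does not say which distance defines the ambiguity set, so the sketch assumes a total-variation ball around an estimated label posterior, and the names `worst_case_risk`, `ref_posterior`, `losses`, and `radius` are hypothetical.

```python
import numpy as np

def worst_case_risk(ref_posterior, losses, radius):
    """Largest expected loss over label distributions q with TV(q, ref) <= radius.

    Assumption for illustration only: the ambiguity set is a total-variation
    ball, TV(q, p) = 0.5 * sum_k |q_k - p_k|, centered at the reference
    (estimated) true-label posterior.
    """
    q = np.asarray(ref_posterior, dtype=float).copy()
    losses = np.asarray(losses, dtype=float)
    worst = int(np.argmax(losses))          # label with the highest loss
    # Mass we may move is capped by the radius and by what other labels hold.
    budget = min(radius, 1.0 - q[worst])
    # Move mass from the lowest-loss labels onto the highest-loss label.
    for k in np.argsort(losses):
        if k == worst or budget <= 0:
            continue
        moved = min(q[k], budget)
        q[k] -= moved
        q[worst] += moved
        budget -= moved
    return float(q @ losses), q
```

For example, with reference posterior (0.7, 0.2, 0.1), per-label losses (0.1, 1.0, 2.0), and radius 0.2, the nominal expected loss of 0.47 rises to about 0.85 in the worst case, because 0.2 of probability mass shifts from the lowest-loss label to the highest-loss one. A robust learner would minimize this worst-case quantity rather than the nominal one.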

Code & Models

Repositories

hguo1728/AdaptCDRP (PyTorch, official)

Taxonomy

Topics: Advanced Statistical Process Monitoring

Methods: Sparse Evolutionary Training