Leveraging an Alignment Set in Tackling Instance-Dependent Label Noise

Donna Tjandra; Jenna Wiens

arXiv:2307.04868 · cs.LG · July 12, 2023

TL;DR

This paper introduces a two-stage method leveraging a small set of known labels to improve model accuracy and reduce bias in datasets with instance-dependent label noise, especially in healthcare applications.

Contribution

The paper proposes a novel two-stage approach using anchor points to effectively handle instance-dependent label noise, improving discriminative performance and fairness.

Findings

01. Improves AUROC from 0.81 to 0.84 on MIMIC-III dataset

02. Reduces bias as measured by AUEOC

03. Outperforms state-of-the-art methods in noisy label settings

Abstract

Noisy training labels can hurt model performance. Most approaches that aim to address label noise assume that the noise is independent of the input features. In practice, however, label noise is often feature-dependent, or instance-dependent, and therefore biased (i.e., some instances are more likely to be mislabeled than others). For example, in clinical care, female patients are more likely to be under-diagnosed for cardiovascular disease compared to male patients. Approaches that ignore this dependence can produce models with poor discriminative performance and, in many healthcare settings, can exacerbate issues around health disparities. In light of these limitations, we propose a two-stage approach to learn in the presence of instance-dependent label noise. Our approach utilizes anchor points, a small subset of data for which we know the observed and ground truth labels. On several…
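To make the idea of instance-dependent label noise concrete, here is a minimal Python sketch of the setting the abstract describes. It is not the authors' method or code. It only simulates labels whose chance of being wrong depends on a feature (here, a binary group attribute), and then draws a small subset for which both the observed and ground truth labels are kept. The function names, the flip rates (0.1 and 0.4), and the subset size are all hypothetical and chosen for illustration.

```python
import numpy as np


def simulate_instance_dependent_noise(y_true, group, flip_rates, rng):
    """Turn some true positives into observed negatives.

    The flip probability depends on each instance's group, so the noise
    is instance-dependent rather than uniform. (Illustrative assumption:
    only positives are missed, mimicking under-diagnosis.)
    """
    y_true = np.asarray(y_true)
    group = np.asarray(group)
    # Look up a per-instance flip probability from the instance's group.
    p_flip = np.array([flip_rates[g] for g in group])
    # Negatives are never flipped in this sketch.
    flipped = (rng.random(y_true.shape[0]) < p_flip) & (y_true == 1)
    return np.where(flipped, 0, y_true)


rng = np.random.default_rng(0)
n = 10_000
group = rng.integers(0, 2, size=n)    # hypothetical binary attribute
y_true = rng.integers(0, 2, size=n)   # ground truth labels
flip_rates = {0: 0.1, 1: 0.4}         # hypothetical, group-dependent

y_obs = simulate_instance_dependent_noise(y_true, group, flip_rates, rng)

# A small subset where both observed and ground truth labels are known.
subset_idx = rng.choice(n, size=500, replace=False)
subset = {"observed": y_obs[subset_idx], "truth": y_true[subset_idx]}

# Empirical mislabeling rate among true positives, per group.
observed_rates = {}
for g in (0, 1):
    mask = (group == g) & (y_true == 1)
    observed_rates[g] = float(np.mean(y_obs[mask] != y_true[mask]))
    print(f"group {g}: mislabeling rate among positives = {observed_rates[g]:.3f}")
```

In this sketch, a model trained on `y_obs` alone would see systematically fewer positives in group 1 than in group 0, which is the kind of bias the abstract warns about. The small labeled subset is what makes it possible to estimate how the noise varies across instances.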

Code & Models

Repositories

MLD3/Instance_Dependent_Label_Noise (PyTorch, official)

Taxonomy

Topics: Machine Learning and Data Classification · Machine Learning in Healthcare · Human Pose and Action Recognition