Regroup Median Loss for Combating Label Noise
Fengpeng Li, Kemou Li, Jinyu Tian, Jiantao Zhou

TL;DR
This paper introduces Regroup Median Loss (RML), a novel loss function and sample selection strategy that effectively reduces the impact of label noise in training deep models, improving performance on noisy datasets.
Contribution
The paper proposes RML, a new loss and sample selection method that better handles label noise, along with an RML-based semi-supervised approach that the authors report outperforms existing methods.
Findings
RML significantly improves accuracy on noisy datasets.
The semi-supervised RML method enhances model robustness.
Source code is publicly available.
Abstract
Training deep models requires large-scale annotated datasets. Because annotating a large number of samples is difficult, label noise caused by incorrect annotations is inevitable, resulting in low model performance and poor generalization. To combat label noise, current methods usually select clean samples based on the small-loss criterion and use these samples for training. Because some noisy samples resemble clean ones, these small-loss criterion-based methods are still affected by label noise. To address this issue, we propose Regroup Median Loss (RML) to reduce the probability of selecting noisy samples and to correct the losses of noisy samples. RML randomly selects samples with the same label as the training samples based on a new loss processing method. Then, we combine the stable mean loss and the robust median loss through a proposed…
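The two ideas named in the abstract can be illustrated with a minimal sketch. The abstract is truncated before it describes how the mean and median losses are combined, so the code below is not the authors' procedure. It assumes a generic median-of-means estimate over randomly drawn same-label losses, and every function name and parameter (small_loss_selection, regrouped_median_of_means, keep_ratio, n_groups, group_size) is hypothetical.

```python
import random
import statistics


def small_loss_selection(losses, keep_ratio):
    """Small-loss criterion: return the indices of the keep_ratio fraction
    of samples with the smallest loss, treating them as likely clean."""
    n_keep = max(1, int(len(losses) * keep_ratio))
    order = sorted(range(len(losses)), key=lambda i: losses[i])
    return sorted(order[:n_keep])


def regrouped_median_of_means(sample_loss, same_label_losses,
                              n_groups, group_size, rng):
    """Illustrative median-of-means estimate (an assumption, not the paper's
    exact formulation). Each group holds the sample's own loss plus
    group_size losses drawn at random from samples with the same label.
    The group mean is the stable part; the median across groups is the
    robust part."""
    group_means = []
    for _ in range(n_groups):
        drawn = rng.sample(same_label_losses, group_size)
        group_means.append(statistics.fmean(drawn + [sample_loss]))
    return statistics.median(group_means)


if __name__ == "__main__":
    rng = random.Random(0)
    per_sample_losses = [0.1, 2.5, 0.3, 0.2, 3.0, 1.9]
    print(small_loss_selection(per_sample_losses, keep_ratio=0.5))

    # A suspiciously large loss is pulled toward the losses of same-label peers.
    peers = [0.2, 0.3, 0.25, 0.4, 0.35, 0.15, 0.3, 0.2]
    print(regrouped_median_of_means(3.0, peers, n_groups=5, group_size=3, rng=rng))
```

In this sketch a sample whose own loss is much larger than that of its same-label peers receives a smaller estimated loss, which is one plausible reading of "correct losses of noisy samples"; the published method may differ in how groups are formed and combined.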
Taxonomy
Topics: Machine Learning and Data Classification · Infrastructure Maintenance and Monitoring · Industrial Vision Systems and Defect Detection
