Optimized Gradient Clipping for Noisy Label Learning
Xichen Ye, Yifan Wu, Weizhong Zhang, Xiaoqiang Li, Yifan Chen, and Cheng Jin

TL;DR
This paper introduces Optimized Gradient Clipping (OGC), a method that dynamically adjusts the gradient clipping threshold during training to improve robustness against various types of label noise, outperforming fixed-threshold methods.
Contribution
We propose a novel dynamic gradient clipping approach that adapts the clipping threshold based on the ratio of noise gradients to clean gradients after clipping, enhancing robustness in noisy label learning.
Findings
OGC improves robustness across different noise types.
Dynamic threshold adjustment outperforms fixed thresholds.
Statistical analysis certifies noise-tolerance of OGC.
Abstract
Previous research has shown that constraining the gradient of the loss function with respect to model-predicted probabilities can enhance model robustness against noisy labels. These methods typically select a fixed threshold for gradient clipping using validation data to obtain the desired robustness against noise. However, this common practice overlooks how the gradient distributions of clean and noisy-labeled samples change at different stages of training, which significantly limits the model's ability to adapt to the variable nature of gradients throughout the training process. To address this issue, we propose a simple yet effective approach called Optimized Gradient Clipping (OGC), which dynamically adjusts the clipping threshold based on the ratio of noise gradients to clean gradients after clipping, estimated by modeling the distributions of clean and noisy…
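The idea of choosing a clipping threshold from the post-clipping ratio of noise gradients to clean gradients can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes cross-entropy loss, whose gradient with respect to the predicted probability p of the labeled class has magnitude 1/p, and it assumes a per-sample probability of being noisy is already available (the hypothetical `noisy_weight` array). The function names, the candidate grid, and the `max_ratio` bound are all illustrative assumptions.

```python
import numpy as np

def clipped_grad_magnitude(p, tau):
    """Magnitude of d(-log p)/dp, which is 1/p, clipped at threshold tau."""
    return np.minimum(1.0 / p, tau)

def noise_to_clean_ratio(p, noisy_weight, tau):
    """Ratio of total clipped gradient from noisy samples to that from clean samples.

    noisy_weight[i] is an assumed estimate of the probability that sample i
    carries a wrong label; how it is obtained is outside this sketch.
    """
    g = clipped_grad_magnitude(p, tau)
    noisy = np.sum(noisy_weight * g)
    clean = np.sum((1.0 - noisy_weight) * g)
    return noisy / clean

def choose_threshold(p, noisy_weight, max_ratio, candidates):
    """Pick the largest candidate threshold whose ratio stays at or below max_ratio.

    Falls back to the smallest candidate if none satisfies the bound.
    """
    feasible = [
        t for t in sorted(candidates)
        if noise_to_clean_ratio(p, noisy_weight, t) <= max_ratio
    ]
    return feasible[-1] if feasible else min(candidates)

# Toy batch: two confident (assumed clean) and two unconfident (assumed noisy) samples.
p = np.array([0.9, 0.8, 0.1, 0.05])
noisy_weight = np.array([0.0, 0.0, 1.0, 1.0])
tau = choose_threshold(p, noisy_weight, max_ratio=2.0, candidates=[1.0, 2.0, 5.0, 20.0])
```

Re-running the selection at each training stage, as predicted probabilities and noise estimates change, is what makes the threshold dynamic rather than fixed in this sketch.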
Taxonomy
Topics: Transport Systems and Technology
Methods: Gradient Clipping
