When Optimizing $f$-divergence is Robust with Label Noise
Jiaheng Wei, Yang Liu

TL;DR
This paper investigates the robustness of $f$-divergence measures in classification tasks under label noise, providing theoretical analysis, practical fixes, and experimental validation for improved learning with noisy labels.
Contribution
It offers a theoretical framework for understanding when and how $f$-divergences are robust to label noise, and proposes methods to enhance their robustness.
Findings
Certain $f$-divergences are inherently robust to label noise.
A decoupling property links divergence to clean and noisy label distributions.
Proposed fixes improve robustness of non-robust divergence measures.
Abstract
We show when maximizing a properly defined $f$-divergence measure with respect to a classifier's predictions and the supervised labels is robust with label noise. Leveraging its variational form, we derive a nice decoupling property for a family of $f$-divergence measures when label noise is present, where the divergence is shown to be a linear combination of the variational difference defined on the clean distribution and a bias term introduced due to the noise. The above derivation helps us analyze the robustness of different $f$-divergence functions. With established robustness, this family of $f$-divergence functions arises as useful metrics for the problem of learning with noisy labels, which do not require the specification of the labels' noise rate. When they are possibly not robust, we propose fixes to make them so. In addition to the analytical results, we present thorough…
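As a rough numerical illustration of the robustness idea in the abstract (not the paper's derivation or code), the sketch below measures the total variation divergence between the joint distribution of a binary classifier's prediction and the label and the product of their marginals, first with clean labels and then with noisy ones. It assumes class-dependent label noise that is independent of the prediction given the true label; the flip rates `e0`, `e1`, the example table `clean_joint`, and both function names are hypothetical. Under these assumptions the noisy divergence equals the clean one scaled by $|1 - e_0 - e_1|$, so ranking classifiers by it does not require knowing the noise rates.

```python
import numpy as np

def tv_joint_vs_product(joint):
    """Total variation between a joint table P(h, y) and the product of its marginals."""
    p_h = joint.sum(axis=1, keepdims=True)  # marginal over predictions h (rows)
    p_y = joint.sum(axis=0, keepdims=True)  # marginal over labels y (columns)
    return 0.5 * np.abs(joint - p_h * p_y).sum()

def apply_label_noise(joint, e0, e1):
    """Flip labels with rates e0 = P(noisy=1 | y=0) and e1 = P(noisy=0 | y=1)."""
    transition = np.array([[1.0 - e0, e0],
                           [e1, 1.0 - e1]])
    # noisy[h, y_noisy] = sum_y joint[h, y] * transition[y, y_noisy]
    return joint @ transition

# Hypothetical clean joint distribution: rows are predictions h, columns are true labels y.
clean_joint = np.array([[0.35, 0.10],
                        [0.05, 0.50]])
e0, e1 = 0.2, 0.3  # assumed flip rates, chosen only for illustration

noisy_joint = apply_label_noise(clean_joint, e0, e1)
tv_clean = tv_joint_vs_product(clean_joint)
tv_noisy = tv_joint_vs_product(noisy_joint)

print("clean TV:", tv_clean)
print("noisy TV:", tv_noisy)
print("scaled clean TV:", abs(1.0 - e0 - e1) * tv_clean)
```

The scaling holds here because, with binary labels, the gap between the joint and the product of marginals has opposite signs in the two label columns, so the noise transition shrinks every entry of that gap by the same factor. Other $f$-divergences need not behave this way, which is the case the abstract's bias term and proposed fixes address.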
Taxonomy
Topics: Machine Learning and Data Classification · Machine Learning and Algorithms · Advanced Statistical Methods and Models
