Detecting Label Noise via Leave-One-Out Cross-Validation
Yu-Hang Tang, Yuanran Zhu, Wibe A. de Jong

TL;DR
This paper introduces an algorithm based on Gaussian process regression (GPR) that detects and corrects noisy real-valued labels, identifying corrupted samples through leave-one-out cross-validation and thereby improving model accuracy.
Contribution
The paper proposes a novel heteroscedastic noise modeling approach with a multiplicative update scheme for effective label noise detection and correction in regression tasks.
Findings
Successfully identifies corrupted labels in synthetic and real data
Enhances regression accuracy by removing noisy labels
Demonstrates convergence and robustness of the proposed method
Abstract
We present a simple algorithm for identifying and correcting real-valued noisy labels from a mixture of clean and corrupted sample points using Gaussian process regression. A heteroscedastic noise model is employed, in which additive Gaussian noise terms with independent variances are associated with each and all of the observed labels. Optimizing the noise model using maximum likelihood estimation leads to the containment of the GPR model's predictive error by the posterior standard deviation in leave-one-out cross-validation. A multiplicative update scheme is proposed for solving the maximum likelihood estimation problem under non-negative constraints. While we provide proof of convergence for certain special cases, the multiplicative scheme has empirically demonstrated monotonic convergence behavior in virtually all our numerical experiments. We show that the presented method can…
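To make the leave-one-out idea in the abstract concrete, here is a minimal sketch in Python with NumPy. It is not the paper's multiplicative update scheme: the per-sample noise variances are held fixed rather than optimized, and the kernel, length scale, toy data, and the names `rbf_kernel` and `loo_residuals` are all assumptions chosen for illustration. It uses the standard closed-form leave-one-out identities for GP regression and compares each label's leave-one-out error with the corresponding posterior standard deviation.

```python
import numpy as np

def rbf_kernel(X, length_scale=1.0):
    # Squared-exponential kernel on 1-D inputs (an illustrative choice,
    # not necessarily the kernel used in the paper).
    d = X[:, None] - X[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def loo_residuals(K, y, noise_var):
    # Closed-form leave-one-out prediction for GP regression with
    # per-sample (heteroscedastic) noise variances on the diagonal.
    Ky_inv = np.linalg.inv(K + np.diag(noise_var))
    alpha = Ky_inv @ y
    diag = np.diag(Ky_inv)
    loo_mean = y - alpha / diag      # prediction for y_i without sample i
    loo_std = np.sqrt(1.0 / diag)    # predictive std for y_i without sample i
    return y - loo_mean, loo_std

# Toy data: a smooth function with one deliberately corrupted label.
X = np.linspace(0.0, 10.0, 40)
y = np.sin(X)
y[17] += 3.0                          # hypothetical corrupted sample

noise_var = np.full(X.shape, 1e-2)    # fixed here; the paper optimizes these
K = rbf_kernel(X)
resid, std = loo_residuals(K, y, noise_var)

# Standardized leave-one-out error: large values mean the label is not
# "contained" by the posterior standard deviation.
z = np.abs(resid) / std
suspect = int(np.argmax(z))
print("most suspicious index:", suspect)
```

In this toy setup the corrupted label has by far the largest standardized error. Under the approach described in the abstract, the noise variance attached to such a sample would be increased by maximum likelihood estimation until its error is covered by the posterior standard deviation; that optimization step is not reproduced here.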
Taxonomy
Topics: Machine Learning and Data Classification · Machine Learning and Algorithms · Gaussian Processes and Bayesian Inference
Methods: Gaussian Process
