Ensemble Learning with Manifold-Based Data Splitting for Noisy Label Correction
Hao-Chiang Shao, Hsin-Chieh Wang, Weng-Tai Su, and Chia-Wen Lin

TL;DR
This paper introduces an ensemble learning approach that leverages local manifold structures to identify and correct noisy labels concentrated near decision boundaries, improving model robustness.
Contribution
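Unlike typical ensemble strategies, which increase prediction diversity among sub-models through additional loss terms, the method trains each sub-model on a disjoint subset formed as the union of the nearest neighbors of randomly selected seed samples on the data manifold.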
Findings
Outperforms existing methods on real-world noisy datasets.
Effectively identifies and corrects labels near decision boundaries.
Reduces impact of noisy labels on model training.
Abstract
Label noise in training data can significantly degrade a model's generalization performance for supervised learning tasks. Here we focus on the problem that noisy labels are primarily mislabeled samples, which tend to be concentrated near decision boundaries, rather than uniformly distributed, and whose features should be equivocal. To address the problem, we propose an ensemble learning method to correct noisy labels by exploiting the local structures of feature manifolds. Different from typical ensemble strategies that increase the prediction diversity among sub-models via certain loss terms, our method trains sub-models on disjoint subsets, each being a union of the nearest-neighbors of randomly selected seed samples on the data manifold. As a result, each sub-model can learn a coarse representation of the data manifold along with a corresponding graph. Moreover, only a limited…
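The splitting step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `split_by_seed_neighborhoods`, its parameters (`n_subsets`, `seeds_per_subset`, `k`), the round-robin assignment order, and the use of plain Euclidean distance in feature space as a stand-in for distance on the data manifold are all assumptions made for illustration.

```python
import numpy as np

def split_by_seed_neighborhoods(features, n_subsets, seeds_per_subset, k, seed=0):
    """Hypothetical sketch: build disjoint subsets, each a union of the
    k nearest still-unassigned neighbors of randomly drawn seed samples.
    Euclidean distance is an assumed proxy for manifold distance."""
    rng = np.random.default_rng(seed)
    n = features.shape[0]
    unassigned = np.ones(n, dtype=bool)
    subsets = [[] for _ in range(n_subsets)]

    # Round-robin over subsets so that each one collects several neighborhoods.
    for _ in range(seeds_per_subset):
        for s in range(n_subsets):
            candidates = np.flatnonzero(unassigned)
            if candidates.size == 0:
                break  # every sample has already been assigned
            seed_idx = rng.choice(candidates)
            # Distances from the seed to every unassigned sample (seed is at 0).
            dists = np.linalg.norm(features[candidates] - features[seed_idx], axis=1)
            nearest = candidates[np.argsort(dists)[:k]]
            subsets[s].extend(nearest.tolist())
            unassigned[nearest] = False  # removing them keeps subsets disjoint

    return [np.array(s, dtype=int) for s in subsets]

# Usage on synthetic features (random data, for illustration only).
X = np.random.default_rng(42).normal(size=(200, 8))
subsets = split_by_seed_neighborhoods(X, n_subsets=4, seeds_per_subset=5, k=10)
all_idx = np.concatenate(subsets)
print([len(s) for s in subsets], len(np.unique(all_idx)))
```

Because every neighborhood is drawn only from samples that have not yet been assigned, no index can appear in two subsets. In this sketch each sub-model would then be trained on `X[subsets[i]]` and its labels; how the sub-models' predictions are combined to correct labels is not covered here.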
Taxonomy
Topics: Machine Learning and Data Classification · Anomaly Detection Techniques and Applications · Infrastructure Maintenance and Monitoring
