Task-Agnostic Noisy Label Detection via Standardized Loss Aggregation
Inhyuk Park, Doohyun Park

TL;DR
The paper introduces Standardized Loss Aggregation (SLA), a task-agnostic, statistically grounded framework for detecting noisy labels at the sample level in large-scale medical imaging datasets by aggregating standardized fold-level validation losses across repeated cross-validation runs.
Contribution
It presents a continuous estimator of label noisiness that generalizes discrete hard-counting schemes by capturing both the frequency and the magnitude of performance deviations, yielding interpretable and statistically stable scores.
Findings
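On a public fundus dataset, SLA consistently outperforms the hard-counting baseline across all noise levels.
SLA converges substantially faster than the baseline, especially under low noise ratios where subtle loss variations are informative.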
High SLA scores correlate with ambiguous or mislabeled samples.
Abstract
Noisy labels are common in large-scale medical imaging datasets due to inter-observer variability and ambiguous cases. We propose a statistically grounded and task-agnostic framework, Standardized Loss Aggregation (SLA), for detecting noisy labels at the sample level. SLA quantifies label reliability by aggregating standardized fold-level validation losses across repeated cross-validation runs. This formulation generalizes discrete hard-counting schemes into a continuous estimator that captures both the frequency and magnitude of performance deviations, yielding interpretable and statistically stable noisiness scores. Experiments on a public fundus dataset demonstrate that SLA consistently outperforms the hard-counting baseline across all noise levels and converges substantially faster, especially under low noise ratios where subtle loss variations are informative. Samples with high SLA…
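The aggregation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the summary does not give the exact formula, so the code assumes that fold losses are z-scored across the folds of each run and that a sample's score is the mean z-score of the folds it was validated in. The hard-counting variant shown for comparison assumes a threshold at the run mean (z > 0). The names `standardize`, `aggregate_scores`, `fold_losses`, and `fold_assignments` are hypothetical.

```python
from statistics import mean, pstdev


def standardize(values):
    """Z-score a list of fold losses from one cross-validation run."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All folds performed identically; no fold deviates from the rest.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def aggregate_scores(fold_losses, fold_assignments):
    """Return (sla_like, hard_count) per-sample scores.

    fold_losses[r][k]      : validation loss of fold k in run r
    fold_assignments[r][i] : index of the fold that held sample i in run r

    Assumption: standardization is done across folds within each run and
    scores are averaged over runs. The paper's exact formulation may differ.
    """
    n_runs = len(fold_losses)
    n_samples = len(fold_assignments[0])
    sla_like = [0.0] * n_samples
    hard_count = [0.0] * n_samples
    for run_losses, run_folds in zip(fold_losses, fold_assignments):
        z = standardize(run_losses)
        for i, k in enumerate(run_folds):
            # Continuous score: keeps the magnitude of the deviation.
            sla_like[i] += z[k] / n_runs
            # Hard count: only records whether the fold was worse than average.
            hard_count[i] += (1.0 if z[k] > 0 else 0.0) / n_runs
    return sla_like, hard_count


# Toy usage: 2 runs, 3 folds, 4 samples (values are made up for illustration).
toy_losses = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]]
toy_folds = [[0, 1, 2, 2], [0, 1, 2, 0]]
toy_sla, toy_hard = aggregate_scores(toy_losses, toy_folds)
```

In this toy input, the last sample lands in the worst-performing fold in both runs, so it receives the highest score under both schemes. The continuous score additionally reflects how far that fold's loss sits from the run average, which is the property the abstract credits for faster convergence at low noise ratios.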
