Hard Samples, Bad Labels: Robust Loss Functions That Know When to Back Off

Nicholas Pellegrino; David Szczecina; Paul Fieguth

arXiv:2511.16512·cs.LG·November 27, 2025

TL;DR

This paper introduces two novel loss functions, Blurry Loss and Piecewise-zero Loss, designed to improve robustness against label errors by de-emphasizing difficult samples, thereby enhancing error detection and data cleaning in supervised learning.

Contribution

The paper proposes two new loss functions that improve robustness to label errors and outperform existing methods in error detection across various datasets.

Findings

01. Outperform state-of-the-art robust loss functions in error detection

02. Effective across both uniform and non-uniform label corruption

03. Enhance data cleaning by better identifying mislabeled samples

Abstract

Incorrectly labelled training data are frustratingly ubiquitous in both benchmark and specially curated datasets. Such mislabelling clearly adversely affects the performance and generalizability of models trained through supervised learning on the associated datasets. Frameworks for detecting label errors typically require well-trained / well-generalized models; however, at the same time most frameworks rely on training these models on corrupt data, which clearly has the effect of reducing model generalizability and subsequent effectiveness in error detection -- unless a training scheme robust to label errors is employed. We evaluate two novel loss functions, Blurry Loss and Piecewise-zero Loss, that enhance robustness to label errors by de-weighting or disregarding difficult-to-classify samples, which are likely to be erroneous. These loss functions leverage the idea that mislabelled…
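The general idea of disregarding difficult-to-classify samples can be illustrated with a minimal sketch. This is not the paper's definition of Blurry Loss or Piecewise-zero Loss, neither of which is given in the excerpt above. The function name `truncated_cross_entropy` and the `cutoff` parameter are hypothetical, chosen only to show how a loss might drop to zero once the model assigns very low probability to a sample's given label.

```python
import math


def cross_entropy(p_label: float) -> float:
    """Standard cross-entropy for one sample, given the probability
    the model assigns to that sample's (possibly wrong) label."""
    return -math.log(p_label)


def truncated_cross_entropy(p_label: float, cutoff: float = 0.1) -> float:
    """Hypothetical illustration, not the paper's loss.

    Samples whose label probability falls below `cutoff` are treated as
    likely mislabelled and contribute zero loss; all other samples use
    ordinary cross-entropy.
    """
    if p_label < cutoff:
        return 0.0
    return -math.log(p_label)


def mean_loss(p_labels, loss_fn):
    """Average a per-sample loss over a batch of label probabilities."""
    return sum(loss_fn(p) for p in p_labels) / len(p_labels)


# Made-up probabilities assigned to the given labels of four samples.
# The last sample is very hard to fit, which may indicate a label error.
p_labels = [0.9, 0.8, 0.7, 0.02]

standard = mean_loss(p_labels, cross_entropy)
robust = mean_loss(p_labels, truncated_cross_entropy)
```

With standard cross-entropy the single hard sample dominates the batch average, whereas the truncated variant ignores it, so `robust` is smaller than `standard`. The cutoff value here is arbitrary; how the threshold is chosen or scheduled in the paper is not stated in this excerpt.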

Taxonomy

Topics: Machine Learning and Data Classification · Imbalanced Data Classification Techniques · Adversarial Robustness in Machine Learning