Calibration improves detection of mislabeled examples

Ilies Chibane; Thomas George; Pierre Nodet; Vincent Lemaire

arXiv:2511.02738 · cs.LG · November 5, 2025

TL;DR

This paper shows empirically that calibrating the base model used for mislabel detection improves the accuracy and robustness of identifying mislabeled instances, offering a practical option for industrial machine learning systems.

Contribution

The study shows that calibrating the base model makes the trust scores derived from it more effective for detecting mislabeled instances.

Findings

01. Calibrating the base model improves the accuracy of mislabeled instance detection.

02. Calibration makes mislabeled instance detection more robust.

03. The empirical results support calibration as a practical step for industrial applications.

Abstract

Mislabeled data is a pervasive issue that undermines the performance of machine learning systems in real-world applications. An effective approach to mitigate this problem is to detect mislabeled instances and subject them to special treatment, such as filtering or relabeling. Automatic mislabel detection methods typically train a base machine learning model and then probe it on each instance to obtain a trust score indicating whether the provided label is genuine or incorrect. The properties of this base model are thus of paramount importance. In this paper, we investigate the impact of calibrating this model. Our empirical results show that using calibration methods improves the accuracy and robustness of mislabeled instance detection, providing a practical and effective solution for industrial applications.
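
The pipeline described in the abstract (train a base model, calibrate it, then probe it for a per-instance trust score) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name a calibration method or a trust score, so the sketch assumes temperature scaling fitted by grid search on held-out data and uses the calibrated probability of the provided label as the trust score. All function names and the toy logits are hypothetical.

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax of logits divided by a temperature (numerically stable)."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def mean_nll(logit_rows, labels, temperature):
    """Mean negative log-likelihood of the labels at a given temperature."""
    losses = [-math.log(softmax(row, temperature)[y])
              for row, y in zip(logit_rows, labels)]
    return sum(losses) / len(losses)

def fit_temperature(logit_rows, labels, grid=None):
    """Assumed calibration step: pick the temperature minimizing held-out NLL."""
    grid = grid or [0.25 * k for k in range(1, 41)]  # 0.25, 0.50, ..., 10.0
    return min(grid, key=lambda t: mean_nll(logit_rows, labels, t))

def trust_scores(logit_rows, given_labels, temperature=1.0):
    """Assumed trust score: calibrated probability of the provided label."""
    return [softmax(row, temperature)[y]
            for row, y in zip(logit_rows, given_labels)]

def flag_least_trusted(scores, k):
    """Indices of the k instances with the lowest trust scores."""
    return sorted(range(len(scores)), key=lambda i: scores[i])[:k]

# Toy held-out logits from a hypothetical, overconfident base model.
val_logits = [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 4.0], [4.0, 0.0, 0.0]]
val_labels = [0, 1, 2, 1]  # the last prediction is confidently wrong

# Toy logits for the instances whose provided labels are being audited.
train_logits = [[5.0, 0.0, 0.0], [0.0, 5.0, 0.0], [5.0, 0.0, 0.0], [0.0, 0.0, 5.0]]
given_labels = [0, 1, 2, 2]  # instance 2 disagrees with the model

temperature = fit_temperature(val_logits, val_labels)
raw = trust_scores(train_logits, given_labels)
calibrated = trust_scores(train_logits, given_labels, temperature)
suspects = flag_least_trusted(calibrated, k=1)
```

Flagged instances would then be filtered or relabeled, as the abstract suggests. The grid search stands in for any calibration method; it is chosen here only because it needs no external library.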

Taxonomy

Topics: Machine Learning and Data Classification · Software Testing and Debugging Techniques · Adversarial Robustness in Machine Learning