Filtering instances and rejecting predictions to obtain reliable models in healthcare

Maria Gabriela Valeriano; David Kohan Marzagão; Alfredo Montelongo; Carlos Roberto Veiga Kiffer; Natan Katz; Ana Carolina Lorena

arXiv:2510.24368·cs.LG·October 29, 2025

TL;DR

This paper presents a two-step data-centric method for healthcare machine learning: Instance Hardness filtering of the training data, followed by confidence-based rejection at inference, so that only predictions the model is confident in are retained.

Contribution

The paper integrates Instance Hardness (IH) filtering during training with confidence-based rejection during inference to enhance model reliability in healthcare applications.

Findings

01 Improved model reliability with high rejection rates.

02 Effective filtering of problematic instances during training.

03 Enhanced prediction confidence in real-world healthcare datasets.

Abstract

Machine Learning (ML) models are widely used in high-stakes domains such as healthcare, where the reliability of predictions is critical. However, these models often fail to account for uncertainty, providing predictions even with low confidence. This work proposes a novel two-step data-centric approach to enhance the performance of ML models by improving data quality and filtering low-confidence predictions. The first step involves leveraging Instance Hardness (IH) to filter problematic instances during training, thereby refining the dataset. The second step introduces a confidence-based rejection mechanism during inference, ensuring that only reliable predictions are retained. We evaluate our approach using three real-world healthcare datasets, demonstrating its effectiveness at improving model reliability while balancing predictive performance and rejection rate. Additionally, we use…
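
The two steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which Instance Hardness measure, classifier, or thresholds the paper uses. The sketch assumes a k-Disagreeing-Neighbours (kDN) style hardness score, a plain k-nearest-neighbour vote as the classifier, and the vote share as the confidence. The function names (`nearest`, `instance_hardness`, `filter_hard`, `predict_or_reject`), the thresholds `max_hardness=0.5` and `min_conf=0.8`, and the toy data are all hypothetical.

```python
import math


def nearest(X, query, k, skip=None):
    """Indices of the k points in X closest to `query` (Euclidean distance)."""
    ranked = sorted((math.dist(x, query), i) for i, x in enumerate(X) if i != skip)
    return [i for _, i in ranked[:k]]


def instance_hardness(X, y, k=3):
    """Assumed kDN-style score: share of a point's k neighbours with a different label."""
    return [
        sum(y[j] != y[i] for j in nearest(X, X[i], k, skip=i)) / k
        for i in range(len(X))
    ]


def filter_hard(X, y, max_hardness=0.5, k=3):
    """Step 1 (training): drop instances whose hardness exceeds `max_hardness`."""
    ih = instance_hardness(X, y, k)
    keep = [i for i, h in enumerate(ih) if h <= max_hardness]
    return [X[i] for i in keep], [y[i] for i in keep]


def predict_or_reject(X_train, y_train, query, k=3, min_conf=0.8):
    """Step 2 (inference): return (label, confidence); label is None when rejected."""
    votes = {}
    for j in nearest(X_train, query, k):
        votes[y_train[j]] = votes.get(y_train[j], 0) + 1
    label, count = max(votes.items(), key=lambda kv: kv[1])
    conf = count / k  # vote share of the winning class, used as the confidence
    return (label if conf >= min_conf else None), conf


# Hypothetical toy data: two well-separated clusters plus one point at
# (0.5, 0.5) that sits inside the class-0 cluster but is labelled 1.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6), (0.5, 0.5)]
y = [0, 0, 0, 0, 1, 1, 1, 1, 1]

X_clean, y_clean = filter_hard(X, y)

near_cluster = predict_or_reject(X_clean, y_clean, (0.2, 0.2))
between_clusters = predict_or_reject(X_clean, y_clean, (3, 3))
```

On this toy data the filter drops only the point at (0.5, 0.5), whose neighbours all disagree with its label. The query next to the class-0 cluster is accepted, while the query midway between the clusters gets a split vote and is rejected. Raising `min_conf` rejects more predictions and lowering it rejects fewer, which is the trade-off between predictive performance and rejection rate that the abstract refers to.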
