Improving Model Safety by Targeted Error Correction

Abolfazl Mohammadi-Seif; Ricardo Baeza-Yates

arXiv:2605.02544 · cs.AI · May 5, 2026

TL;DR

This paper presents a dual-classifier gradient-boosted decision tree (GBDT) pipeline that improves the safety of machine learning applications by reducing high-risk errors at a latency cost of 1.60% to 1.84%. It is validated on three domains: animal breed classification, skin lesion diagnosis, and prostate histopathology.

Contribution

The authors introduce a post-hoc correction method, built on a dual-classifier GBDT pipeline, that reduces dangerous non-human errors without retraining the underlying model.

Findings

01. Reduces dangerous non-human errors by 34.1% on ISIC

02. Reduces dangerous non-human errors by 12.57% on SICAPv2

03. Adds negligible inference latency (1.60% to 1.84% overhead across the three datasets)
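
The percentages above are most naturally read as relative changes. The summary does not state the exact formulas, so the following Python sketch assumes the usual definitions: error reduction relative to the uncorrected error count, and latency overhead relative to the base model's inference time. The function names and the input numbers are hypothetical and are not taken from the paper.

```python
def relative_reduction_pct(errors_before: float, errors_after: float) -> float:
    """Percentage drop in errors relative to the uncorrected count (assumed definition)."""
    if errors_before <= 0:
        raise ValueError("errors_before must be positive")
    return 100.0 * (errors_before - errors_after) / errors_before


def latency_overhead_pct(base_ms: float, pipeline_ms: float) -> float:
    """Extra inference time added by the correction stage, as a percentage of the base time."""
    if base_ms <= 0:
        raise ValueError("base_ms must be positive")
    return 100.0 * (pipeline_ms - base_ms) / base_ms


# Hypothetical inputs, chosen only to illustrate the arithmetic.
print(relative_reduction_pct(200, 150))   # 25.0
print(latency_overhead_pct(100.0, 102.0))  # 2.0
```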

Abstract

The widespread adoption of machine learning in critical applications demands techniques to mitigate high-consequence errors. Our method uses a dual-classifier GBDT pipeline to distinguish routine human-like errors from high-risk non-human misclassifications. Evaluated across three domains (animal breed classification, skin lesion diagnosis on ISIC 2018, and prostate histopathology on SICAPv2), our framework demonstrates robust safety improvements. To address real-world deployment concerns, our results confirm that the pipeline introduces negligible inference latency (1.60% overhead for the animal dataset, 1.84% for ISIC, and 1.70% for SICAPv2) while outperforming traditional Maximum Class Probability (MCP) baselines in correction precision. Our conservative correction strategy reduced dangerous non-human errors by 34.1% in ISIC and 12.57% in SICAPv2, improving super-class…
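
The abstract gives no implementation details, so the following Python sketch is only one plausible reading of a conservative two-stage correction, not the authors' method. It assumes that stage one scores whether the base prediction is an error, that stage two scores whether that error is of the non-human kind, and that a prediction is overridden only when both scores clear high thresholds. The features, the thresholds, and the fallback to the runner-up class are all assumptions. In practice the two scorers would be trained GBDT models; here they are plain callables so the sketch has no dependencies.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical interface: a scorer maps a feature vector to a probability in [0, 1].
# In a real pipeline each scorer would wrap a trained GBDT classifier.
Scorer = Callable[[Sequence[float]], float]


def confidence_features(probs: Sequence[float]) -> List[float]:
    """Assumed features: top-1 probability and the top-1 minus top-2 margin."""
    ranked = sorted(probs, reverse=True)
    return [ranked[0], ranked[0] - ranked[1]]


def conservative_correct(
    probs: Sequence[float],
    error_scorer: Scorer,
    non_human_scorer: Scorer,
    t_error: float = 0.9,      # assumed threshold, not from the paper
    t_non_human: float = 0.9,  # assumed threshold, not from the paper
) -> Tuple[int, bool]:
    """Return (class index, was_corrected) for one base-model probability vector."""
    if len(probs) < 2:
        raise ValueError("need at least two classes")
    # Class indices ordered from most to least probable.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    feats = confidence_features(probs)
    # Conservative rule: override only when BOTH stages are confident.
    flagged = (error_scorer(feats) >= t_error
               and non_human_scorer(feats) >= t_non_human)
    # Assumed correction: fall back to the runner-up class.
    return (order[1], True) if flagged else (order[0], False)


# Stand-in scorers that always return a fixed score, for illustration only.
print(conservative_correct([0.5, 0.3, 0.2], lambda f: 0.95, lambda f: 0.95))  # (1, True)
print(conservative_correct([0.5, 0.3, 0.2], lambda f: 0.95, lambda f: 0.10))  # (0, False)
```

Requiring agreement from both stages keeps the base prediction whenever either stage is unsure, which is one way to trade recall for correction precision.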
