When Can We Trust Deep Neural Networks? Towards Reliable Industrial Deployment with an Interpretability Guide

Hang-Cheng Dong; Yuhao Jiang; Yibo Jiao; Lu Zou; Kai Zheng; Bingguo Liu; Dong Ye; Guodong Liu

arXiv:2604.19206 · cs.CV · April 22, 2026

TL;DR

This paper introduces a novel post-hoc explanation-based indicator to detect false negatives in industrial defect detection networks, enhancing reliability for safety-critical AI applications.

Contribution

It presents what the authors describe as, to their knowledge, the first method to proactively identify potentially erroneous network outputs. The method uses differences between heatmaps, with an adversarial enhancement to improve detection.

Findings

01 Effectively identifies false negatives in defect detection benchmarks.

02 Achieves 100% recall with adversarial enhancement.

03 Supports reliable AI deployment in safety-critical domains.

Abstract

The deployment of AI systems in safety-critical domains, such as industrial defect inspection, autonomous driving, and medical diagnosis, is severely hampered by their lack of reliability. A single undetected erroneous prediction can lead to catastrophic outcomes. Unfortunately, there is often no alternative but to place trust in the outputs of a trained AI system, which operates without an internal safeguard to flag unreliable predictions, even in cases of high accuracy. We propose a post-hoc explanation-based indicator to detect false negatives in binary defect detection networks. To our knowledge, this is the first method to proactively identify potentially erroneous network outputs. Our core idea leverages the difference between class-specific discriminative heatmaps and class-agnostic ones. We compute the difference in their intersection over union (IoU) as a reliability score. An…
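The abstract's core idea, comparing a class-specific heatmap with a class-agnostic one through IoU, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract is truncated and does not give the exact formula, so the min-max normalization, the 0.5 threshold, the "1 - IoU" score, and the function names `binarize`, `mask_iou`, and `disagreement_score` are all assumptions made here for illustration.

```python
import numpy as np


def binarize(heatmap, threshold=0.5):
    """Min-max normalize a heatmap and threshold it into a boolean mask.

    Assumption: the paper may binarize differently (or not at all).
    """
    h = np.asarray(heatmap, dtype=float)
    span = h.max() - h.min()
    if span == 0:
        # A flat heatmap highlights nothing in particular.
        return np.zeros(h.shape, dtype=bool)
    return (h - h.min()) / span >= threshold


def mask_iou(a, b):
    """Intersection over union of two boolean masks."""
    union = np.logical_or(a, b).sum()
    if union == 0:
        # Two empty masks agree trivially.
        return 1.0
    return float(np.logical_and(a, b).sum() / union)


def disagreement_score(class_specific, class_agnostic, threshold=0.5):
    """Hypothetical reliability indicator: 1 - IoU of the two binarized heatmaps.

    A high value means the two explanations highlight different regions,
    which, under this sketch's assumption, would mark the prediction for review.
    """
    return 1.0 - mask_iou(
        binarize(class_specific, threshold),
        binarize(class_agnostic, threshold),
    )


# Toy 4x4 heatmaps standing in for real explanation outputs.
specific = np.zeros((4, 4))
specific[0, 0:2] = 1.0
agnostic = np.zeros((4, 4))
agnostic[0, 1:3] = 1.0
score = disagreement_score(specific, agnostic)
```

In this toy case the two masks share one of three highlighted pixels, so the score is about 0.67. How such a score would be thresholded to flag a suspected false negative is not stated in the visible part of the abstract.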
