When Can We Trust Deep Neural Networks? Towards Reliable Industrial Deployment with an Interpretability Guide
Hang-Cheng Dong, Yuhao Jiang, Yibo Jiao, Lu Zou, Kai Zheng, Bingguo Liu, Dong Ye, Guodong Liu

TL;DR
This paper introduces a novel post-hoc explanation-based indicator to detect false negatives in industrial defect detection networks, enhancing reliability for safety-critical AI applications.
Contribution
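The paper presents what the authors describe as, to their knowledge, the first method to proactively identify potentially erroneous network outputs. It scores each prediction using differences between explanation heatmaps, and adds an adversarial enhancement to improve detection.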
Findings
The indicator effectively identifies false negatives on defect detection benchmarks.
With the adversarial enhancement, the reported recall reaches 100%.
The approach supports reliable AI deployment in safety-critical domains.
Abstract
The deployment of AI systems in safety-critical domains, such as industrial defect inspection, autonomous driving, and medical diagnosis, is severely hampered by their lack of reliability. A single undetected erroneous prediction can lead to catastrophic outcomes. Unfortunately, there is often no alternative but to place trust in the outputs of a trained AI system, which operates without an internal safeguard to flag unreliable predictions, even in cases of high accuracy. We propose a post-hoc explanation-based indicator to detect false negatives in binary defect detection networks. To our knowledge, this is the first method to proactively identify potentially erroneous network outputs. Our core idea leverages the difference between class-specific discriminative heatmaps and class-agnostic ones. We compute the difference in their intersection over union (IoU) as a reliability score. An…
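The overlap measure named in the abstract can be illustrated with a small sketch. The abstract is truncated and does not say how the heatmaps are produced, how they are thresholded, or exactly how the IoU values are combined into the reliability score, so everything below is an assumption for illustration only: the function names `binarize`, `heatmap_iou`, and `disagreement_score`, the fixed threshold, and the choice of `1 - IoU` as a disagreement measure are hypothetical and are not the authors' implementation.

```python
from typing import List, Sequence

# A heatmap is treated here as a 2D grid of saliency values in [0, 1].
Heatmap = Sequence[Sequence[float]]


def binarize(heatmap: Heatmap, threshold: float) -> List[List[bool]]:
    """Turn a heatmap into a binary mask (assumed fixed threshold)."""
    return [[value >= threshold for value in row] for row in heatmap]


def heatmap_iou(map_a: Heatmap, map_b: Heatmap, threshold: float = 0.5) -> float:
    """Intersection over union of two thresholded heatmaps of equal shape."""
    mask_a = binarize(map_a, threshold)
    mask_b = binarize(map_b, threshold)
    intersection = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            intersection += int(a and b)
            union += int(a or b)
    # Convention chosen for this sketch: two empty masks agree perfectly.
    return intersection / union if union else 1.0


def disagreement_score(class_specific: Heatmap, class_agnostic: Heatmap,
                       threshold: float = 0.5) -> float:
    """Hypothetical score: high when the two explanations overlap poorly."""
    return 1.0 - heatmap_iou(class_specific, class_agnostic, threshold)


if __name__ == "__main__":
    # Toy 2x2 heatmaps, invented for illustration (not data from the paper).
    specific = [[0.9, 0.8], [0.1, 0.0]]
    agnostic = [[0.9, 0.2], [0.7, 0.0]]
    print(heatmap_iou(specific, agnostic))
    print(disagreement_score(specific, agnostic))
```

In a setup like this, a prediction of "no defect" whose two heatmaps disagree strongly could be routed to manual inspection. The cutoff that separates "trusted" from "flagged" would have to be chosen on validation data.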
