CONFINE: Conformal Prediction for Interpretable Neural Networks
Linhui Huang, Sayeri Lala, Niraj K. Jha

TL;DR
CONFINE introduces a framework that enhances neural network interpretability by providing prediction sets with quantifiable uncertainty, improving transparency and reliability in critical applications like healthcare.
Contribution
It presents a novel conformal prediction framework that offers uncertainty estimates and explanations for neural networks, improving transparency without sacrificing accuracy.
Findings
Boosts accuracy by up to 3.6%.
Achieves correct efficiency up to 3.3% higher than accuracy.
Valid across medical imaging and language tasks.
Abstract
Deep neural networks exhibit remarkable performance, yet their black-box nature limits their utility in fields like healthcare where interpretability is crucial. Existing explainability approaches often sacrifice accuracy and lack quantifiable measures of prediction uncertainty. In this study, we introduce Conformal Prediction for Interpretable Neural Networks (CONFINE), a versatile framework that generates prediction sets with statistically robust uncertainty estimates instead of point predictions to enhance model transparency and reliability. CONFINE not only provides example-based explanations and confidence estimates for individual predictions but also boosts accuracy by up to 3.6%. We define a new metric, correct efficiency, to evaluate the fraction of prediction sets that contain precisely the correct label and show that CONFINE achieves correct efficiency of up to 3.3% higher…
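To make the ideas of a prediction set and of correct efficiency concrete, here is a minimal sketch of generic split conformal prediction. It is not CONFINE's own procedure: the nonconformity score used here (one minus the probability assigned to a label), the function names, and the toy probabilities are all assumptions chosen for illustration. The correct-efficiency function follows the definition in the abstract, namely the fraction of prediction sets that contain precisely the correct label.

```python
import math


def conformal_threshold(cal_probs, cal_labels, alpha):
    """Score threshold from a held-out calibration set (split conformal).

    Assumed nonconformity score: 1 - p(true label). This is a common
    generic choice, not necessarily the score CONFINE uses.
    """
    scores = sorted(1.0 - probs[y] for probs, y in zip(cal_probs, cal_labels))
    n = len(scores)
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    rank = math.ceil((n + 1) * (1.0 - alpha))
    if rank > n:
        # Too few calibration points for this alpha: keep every label.
        return float("inf")
    return scores[rank - 1]


def prediction_set(probs, threshold):
    """All labels whose nonconformity score does not exceed the threshold."""
    return {k for k, p in enumerate(probs) if 1.0 - p <= threshold}


def correct_efficiency(pred_sets, labels):
    """Fraction of prediction sets that contain exactly the correct label."""
    hits = sum(1 for s, y in zip(pred_sets, labels) if s == {y})
    return hits / len(labels)


# Toy, hand-made class probabilities (hypothetical, three classes).
cal_probs = [[0.9, 0.05, 0.05], [0.2, 0.7, 0.1], [0.1, 0.3, 0.6], [0.5, 0.4, 0.1]]
cal_labels = [0, 1, 2, 1]
test_probs = [[0.8, 0.1, 0.1], [0.45, 0.45, 0.1], [0.1, 0.2, 0.7]]
test_labels = [0, 1, 2]

threshold = conformal_threshold(cal_probs, cal_labels, alpha=0.25)
sets = [prediction_set(p, threshold) for p in test_probs]

print("threshold:", threshold)
print("prediction sets:", sets)
print("correct efficiency:", correct_efficiency(sets, test_labels))
```

In this toy run the second test input receives a two-label set because the model is split between two classes. That set still covers the true label, but it does not count toward correct efficiency, which rewards only singleton sets holding the right answer.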
Taxonomy
Topics: Neural Networks and Applications · Explainable Artificial Intelligence (XAI)
