Human-in-the-loop model explanation via verbatim boundary identification in generated neighborhoods
Xianlong Zeng, Fanghao Song, Zhongen Li, Krerkkiat Chusap, Chang Liu

TL;DR
This paper introduces a human-in-the-loop method for explaining black-box models: it generates synthetic neighbors around a given instance, classifies them with the model, and lets a human use the resulting local decision boundary to interpret the model's behavior.
Contribution
It presents a novel three-stage approach combining neighborhood generation, classification, and human interaction to produce local decision boundary explanations for black-box models.
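The first two stages named above can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian perturbation used to generate neighbors, the toy `black_box` model, and the helper names (`generate_neighborhood`, `classify_neighborhood`, `closest_flips`) are all assumptions made for illustration. The printed list of nearest label-flipping neighbors stands in for what a human reviewer would inspect in the third stage.

```python
import random

def black_box(x):
    # Hypothetical stand-in for an opaque model: label 1 if x[0] + x[1] > 1.0.
    return int(x[0] + x[1] > 1.0)

def generate_neighborhood(instance, n_samples=200, scale=0.3, seed=0):
    # Stage 1 (assumption: Gaussian perturbation; the paper's generator may differ).
    rng = random.Random(seed)
    return [[v + rng.gauss(0.0, scale) for v in instance] for _ in range(n_samples)]

def classify_neighborhood(model, neighbors):
    # Stage 2: query the black box on every synthetic neighbor.
    return [model(n) for n in neighbors]

def closest_flips(instance, neighbors, labels, model, k=3):
    # Return the k nearest neighbors whose label differs from the instance's label.
    base = model(instance)

    def dist(n):
        return sum((a - b) ** 2 for a, b in zip(instance, n)) ** 0.5

    flipped = [(dist(n), n) for n, y in zip(neighbors, labels) if y != base]
    flipped.sort(key=lambda t: t[0])
    return flipped[:k]

instance = [0.4, 0.5]
neighbors = generate_neighborhood(instance)
labels = classify_neighborhood(black_box, neighbors)
flips = closest_flips(instance, neighbors, labels, black_box)

# Stage 3 placeholder: show a human the nearest points where the prediction changes.
for d, n in flips:
    print(f"distance={d:.3f} neighbor={[round(v, 3) for v in n]}")
```

The nearest neighbors with a different label give a rough picture of where the local decision boundary lies relative to the instance. A person can then judge whether the changes needed to cross it are plausible.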
Findings
Effective in generating local decision boundaries
Enhances human understanding of model behavior
Demonstrated on two datasets with promising results
Abstract
The black-box nature of machine learning models limits their use in case-critical applications, raising faithful and ethical concerns that lead to trust crises. One possible way to mitigate this issue is to understand how a (mispredicted) decision is carved out from the decision boundary. This paper presents a human-in-the-loop approach to explain machine learning models using verbatim neighborhood manifestation. Contrary to most of the current eXplainable Artificial Intelligence (XAI) systems, which provide hit-or-miss approximate explanations, our approach generates the local decision boundary of the given instance and enables human intelligence to conclude the model behavior. Our method can be divided into three stages: 1) a neighborhood generation stage, which generates instances based on the given sample; 2) a classification stage, which yields classifications on the generated…
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Machine Learning in Healthcare · Adversarial Robustness in Machine Learning
