Perturbation on Feature Coalition: Towards Interpretable Deep Neural Networks
Xuran Hu, Mingzhe Zhu, Zhenpeng Feng, Miloš Daković, Ljubiša Stanković

TL;DR
This paper introduces a perturbation-based interpretability method for deep neural networks that considers feature dependencies through feature coalitions, improving transparency and reliability.
Contribution
It proposes a novel feature coalition-guided perturbation approach with a consistency loss to enhance DNN interpretability, addressing limitations of existing methods.
Findings
Effective in capturing feature dependencies
Improves interpretability of DNNs
Validated through quantitative and qualitative experiments
Abstract
The inherent "black box" nature of deep neural networks (DNNs) compromises their transparency and reliability. Recently, explainable AI (XAI) has garnered increasing attention from researchers, and several perturbation-based interpretation methods have emerged. However, these methods often fail to adequately consider feature dependencies. To solve this problem, we introduce a perturbation-based interpretation guided by feature coalitions, which leverages deep information from the network to extract correlated features. We then propose a carefully designed consistency loss to guide network interpretation. Both quantitative and qualitative experiments are conducted to validate the effectiveness of our proposed method. Code is available at github.com/Teriri1999/Perturebation-on-Feature-Coalition.
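To make the feature-dependency problem concrete, here is a minimal sketch of why perturbing features one at a time can miss dependent features. It is not the paper's method: `toy_score`, `perturbation_drop`, the zero baseline, and the hand-picked coalition `{0, 1}` are all hypothetical choices for illustration. The paper extracts coalitions from the network's internal information and trains with a consistency loss, neither of which is shown here.

```python
def toy_score(x):
    # Hypothetical stand-in for a network's class score.
    # Features 0 and 1 are redundant (either one suffices);
    # feature 2 contributes independently.
    return max(x[0], x[1]) + 0.5 * x[2]


def perturbation_drop(f, x, group, baseline=0.0):
    """Score drop when every feature index in `group` is set to `baseline`."""
    x_pert = [baseline if i in group else v for i, v in enumerate(x)]
    return f(x) - f(x_pert)


x = [1.0, 1.0, 1.0]

# Perturbing one feature at a time: the redundant pair looks unimportant.
single = {i: perturbation_drop(toy_score, x, {i}) for i in range(len(x))}

# Perturbing the pair together (an assumed coalition) reveals its importance.
coalition = perturbation_drop(toy_score, x, {0, 1})

print(single)     # per-feature score drops
print(coalition)  # score drop for the coalition {0, 1}
```

In this toy setting, features 0 and 1 each receive a zero score drop on their own, yet removing them jointly causes the largest drop. Cases like this are what motivate perturbing groups of correlated features together.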
Taxonomy
Topics: Explainable Artificial Intelligence (XAI)
Methods: Softmax · Attention Is All You Need
