Improving Network Interpretability via Explanation Consistency Evaluation
Hefeng Wu, Hao Jiang, Keze Wang, Ziyi Tang, Xianghuan He, Liang Lin

TL;DR
This paper introduces a simple framework that enhances neural network interpretability and performance by using explanation consistency to reweight training samples, improving robustness, accuracy, and localization without extra supervision.
Contribution
The paper proposes a novel explanation consistency metric and a reweighting method that improves both interpretability and accuracy without additional supervision.
Findings
Higher recognition accuracy across benchmarks
Enhanced data debiasing and robustness
More precise localization ability
Abstract
While deep neural networks have achieved remarkable performance, they tend to lack transparency in prediction. The pursuit of greater interpretability in neural networks often results in a degradation of their original performance. Some works strive to improve both interpretability and performance, but they primarily depend on meticulously imposed conditions. In this paper, we propose a simple yet effective framework that acquires more explainable activation heatmaps and simultaneously increases the model performance, without the need for any extra supervision. Specifically, our concise framework introduces a new metric, i.e., explanation consistency, to reweight the training samples adaptively in model learning. The explanation consistency metric is utilized to measure the similarity between the model's visual explanations of the original samples and those of semantic-preserved…
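The reweighting idea described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract is truncated here, so the similarity measure (cosine similarity), the weighting rule (a softmax over inconsistency, so that less consistent samples receive larger weights), and all names such as `explanation_consistency`, `sample_weights`, and `temperature` are assumptions made for this example.

```python
import numpy as np


def explanation_consistency(heatmaps_a, heatmaps_b, eps=1e-8):
    """Cosine similarity between paired heatmaps, one score per sample.

    Both inputs have shape (batch, height, width). Cosine similarity is an
    assumed stand-in for the paper's metric, which the abstract does not spell out.
    """
    a = heatmaps_a.reshape(len(heatmaps_a), -1)
    b = heatmaps_b.reshape(len(heatmaps_b), -1)
    num = (a * b).sum(axis=1)
    den = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1) + eps
    return num / den


def sample_weights(consistency, temperature=1.0):
    """Map consistency scores to per-sample weights with mean 1.

    Assumption: samples with lower consistency get larger weights, via a
    softmax over (1 - consistency) rescaled so the weights average to 1.
    """
    scores = (1.0 - consistency) / temperature
    scores = scores - scores.max()  # subtract the max for numerical stability
    w = np.exp(scores)
    return w * len(w) / w.sum()


def reweighted_loss(per_sample_loss, weights):
    """Weighted mean of per-sample losses."""
    return float(np.mean(weights * per_sample_loss))
```

In a training loop, `heatmaps_a` would come from an explanation method applied to the original batch and `heatmaps_b` from the same method applied to the transformed batch; the resulting weights would then scale the per-sample classification loss before backpropagation.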
Taxonomy
Topics: Topic Modeling
Methods: Softmax · Attention Is All You Need
