Learning how to explain neural networks: PatternNet and PatternAttribution
Pieter-Jan Kindermans, Kristof T. Schütt, Maximilian Alber, Klaus-Robert Müller, Dumitru Erhan, Been Kim, Sven Dähne

TL;DR
This paper introduces PatternNet and PatternAttribution, two explanation techniques for neural networks that are theoretically sound for linear models and provide improved explanations for deep networks.
Contribution
The paper proposes a generalization of existing explanation methods that is reliable for linear models and effective for deep neural networks, addressing the failure of earlier methods in the linear case.
Findings
Existing explanation methods (DeConvNet, Guided BackProp, LRP) do not produce the theoretically correct explanation for linear models.
PatternNet and PatternAttribution are theoretically sound for linear models.
The new methods produce improved explanations for deep neural networks.
Abstract
DeConvNet, Guided BackProp, and LRP were invented to better understand deep neural networks. We show that these methods do not produce the theoretically correct explanation for a linear model. Yet they are used on multi-layer networks with millions of parameters. This is a cause for concern since linear models are simple neural networks. We argue that explanation methods for neural nets should work reliably in the limit of simplicity, the linear models. Based on our analysis of linear models we propose a generalization that yields two explanation techniques (PatternNet and PatternAttribution) that are theoretically sound for linear models and produce improved explanations for deep networks.
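The abstract's point about linear models can be made concrete with a small toy experiment. The sketch below is illustrative and is not taken from this page: it assumes a two-dimensional input built from a signal component (which carries the target y) plus a distractor component (which does not), and it uses a covariance-based direction estimate as one possible way to recover the signal direction. The function names, the specific vectors, and the estimator are all assumptions chosen for the example.

```python
import random


def covariance(u, v):
    """Population covariance of two equal-length sequences."""
    mu = sum(u) / len(u)
    mv = sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)


def toy_linear_example(n=20000, seed=0):
    """Hypothetical toy setup: x = a_s * y + a_d * eps, model output = w . x."""
    rng = random.Random(seed)
    a_s = (1.0, 0.0)  # assumed signal direction
    a_d = (1.0, 1.0)  # assumed distractor direction
    w = (1.0, -1.0)   # weights chosen so that w . a_d = 0 and w . a_s = 1

    y = [rng.gauss(0.0, 1.0) for _ in range(n)]
    eps = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x = [(a_s[0] * yi + a_d[0] * ei, a_s[1] * yi + a_d[1] * ei)
         for yi, ei in zip(y, eps)]

    # The linear model recovers y by cancelling the distractor.
    out = [w[0] * x0 + w[1] * x1 for x0, x1 in x]

    # Covariance-based estimate of the direction in x that varies with the output.
    var_out = covariance(out, out)
    pattern = tuple(covariance([xi[k] for xi in x], out) / var_out
                    for k in range(2))
    return w, a_s, pattern, y, out


if __name__ == "__main__":
    w, a_s, pattern, _, _ = toy_linear_example()
    print("weights          :", w)
    print("signal direction :", a_s)
    print("estimated pattern:", tuple(round(p, 2) for p in pattern))
```

In this setup the weight vector points along (1, -1) because it has to cancel the distractor, while the signal itself lies along (1, 0). An explanation that simply reports the weights (or the gradient, which equals the weights for a linear model) therefore highlights the second input dimension even though it carries no signal, whereas the covariance-based estimate lands close to the signal direction.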
Taxonomy
Topics: Neural Networks and Applications
