Clustering-Based Interpretation of Deep ReLU Network
Nicola Picchiotti, Marco Gori

TL;DR
This paper introduces a clustering-based interpretability method for deep ReLU networks. Inputs that share the same pattern of active neurons form a cluster, and within each cluster the network acts as an affine map, which makes its predictions easier to interpret without changing the network structure.
Contribution
The paper presents a novel approach that exploits ReLU-induced clustering to interpret neural network predictions, providing feature importance explanations post-training.
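The affine-map claim can be made concrete with a short derivation. This is a sketch of the standard argument for ReLU networks, not the paper's own notation: the symbols $W_l$, $b_l$ (layer weights and biases), $D_l$ (a diagonal 0/1 matrix marking which neurons of hidden layer $l$ are active), and $A$, $c$ are assumptions introduced here for illustration.

```latex
% Network with L-1 hidden ReLU layers and a linear output layer.
\begin{align}
h_0 &= x, \qquad
h_l = \operatorname{ReLU}(W_l h_{l-1} + b_l) = D_l(x)\,(W_l h_{l-1} + b_l),
\quad l = 1, \dots, L-1, \\
f(x) &= W_L h_{L-1} + b_L .
\end{align}
% Within a cluster the activation pattern is fixed: D_l(x) = D_l for all l.
% Writing h_l = A_l x + c_l with A_0 = I and c_0 = 0 gives the recursion
\begin{align}
A_l &= D_l W_l A_{l-1}, \qquad
c_l = D_l\,(W_l c_{l-1} + b_l), \\
f(x) &= A x + c, \qquad
A = W_L A_{L-1}, \qquad
c = W_L c_{L-1} + b_L .
\end{align}
```

Because $A$ and $c$ depend only on the activation pattern, every input in the same cluster shares them, and the entries of $A$ can be read as per-feature coefficients for that cluster.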
Findings
Method effectively explains predictions within clusters
Increases interpretability of ReLU networks without structural changes
Demonstrated on Titanic dataset and simulated data
Abstract
Amongst others, the adoption of Rectified Linear Units (ReLUs) is regarded as one of the ingredients of the success of deep learning. ReLU activation has been shown to mitigate the vanishing gradient issue, to encourage sparsity in the learned parameters, and to allow for efficient backpropagation. In this paper, we recognize that the non-linear behavior of the ReLU function gives rise to a natural clustering when the pattern of active neurons is considered. This observation helps to deepen our understanding of the learning mechanism of the network; in fact, we demonstrate that, within each cluster, the network can be fully represented as an affine map. The consequence is that we are able to recover an explanation, in the form of feature importance, for the predictions made by the network for the instances belonging to the cluster. Therefore, the methodology we propose is able to increase the level of…
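The idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the layer sizes, the random weights, and the helper names `forward`, `activation_pattern`, and `local_affine_map` are all hypothetical and chosen only for illustration. The sketch groups inputs by their pattern of active neurons and collapses the network into one affine map per pattern.

```python
import numpy as np

def forward(x, weights, biases):
    """Ordinary forward pass: ReLU on hidden layers, linear output layer."""
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = np.maximum(W @ h + b, 0.0)
    return weights[-1] @ h + biases[-1]

def activation_pattern(x, weights, biases):
    """Tuple of 0/1 flags, one per hidden neuron; used here as the cluster key."""
    h = x
    flags = []
    for W, b in zip(weights[:-1], biases[:-1]):
        z = W @ h + b
        flags.append(z > 0)
        h = np.maximum(z, 0.0)
    return tuple(np.concatenate(flags).astype(int).tolist())

def local_affine_map(x, weights, biases):
    """Return (A, c) such that the network equals A @ x' + c for every x'
    that has the same activation pattern as x."""
    A = np.eye(x.shape[0])      # invariant: hidden state h == A @ x + c
    c = np.zeros(x.shape[0])
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        z = W @ h + b
        mask = (z > 0).astype(float)   # diagonal of the 0/1 activation matrix
        A = mask[:, None] * (W @ A)
        c = mask * (W @ c + b)
        h = np.maximum(z, 0.0)
    return weights[-1] @ A, weights[-1] @ c + biases[-1]

# Hypothetical toy network: 4 inputs, two hidden layers of 8 units, 1 output.
rng = np.random.default_rng(0)
sizes = [4, 8, 8, 1]
weights = [rng.normal(size=(m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
biases = [rng.normal(size=m) for m in sizes[1:]]

# Cluster random inputs by activation pattern.
points = rng.normal(size=(200, 4))
clusters = {}
for p in points:
    clusters.setdefault(activation_pattern(p, weights, biases), []).append(p)

# One affine map per cluster, computed from the cluster's first member.
cluster_maps = {key: local_affine_map(members[0], weights, biases)
                for key, members in clusters.items()}
```

In this sketch, each row of `A` holds the per-feature coefficients that the network applies to every instance in the cluster, which is one way to read off a feature-importance explanation; the paper's exact definition of importance is not reproduced here.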
Taxonomy
Topics: Neural Networks and Applications · Domain Adaptation and Few-Shot Learning · Machine Learning and Data Classification
