Explaining Deep Neural Networks by Leveraging Intrinsic Methods
Biagio La Rosa

TL;DR
This thesis advances explainable AI by developing methods to interpret deep neural networks through self-explanatory designs, neuron analysis, and the evaluation of explanation techniques in visual analytics.
Contribution
It introduces new interpretability techniques, including self-explanatory architectures, neuron analysis, and an assessment of explanation methods in visual analytics.
Findings
Proposed self-explanatory neural network designs with external memory and prototype layers
Uncovered new insights into neuron activation phenomena
Evaluated the effectiveness of explanation techniques in visual analytics systems
Abstract
Despite their impact on society, deep neural networks are often regarded as black-box models due to their intricate structures and the absence of explanations for their decisions. This opacity poses a significant challenge to the wider adoption and trustworthiness of AI systems. This thesis addresses this issue by contributing to the field of eXplainable AI, focusing on enhancing the interpretability of deep neural networks. The core contributions lie in introducing novel techniques aimed at making these networks more interpretable by leveraging an analysis of their inner workings. Specifically, the contributions are threefold. Firstly, the thesis introduces designs for self-explanatory deep neural networks, such as the integration of external memory for interpretability purposes and the usage of prototype and constraint-based layers across several domains. Secondly, this research delves…
Taxonomy
Topics: Anomaly Detection Techniques and Applications
