Not All Features Are Equal: Feature Leveling Deep Neural Networks for Better Interpretation
Yingjing Lu, Runde Yang

TL;DR
This paper introduces a feature leveling architecture for deep neural networks that isolates features at different levels to enhance interpretability without sacrificing performance.
Contribution
The paper proposes a novel feature leveling architecture that separates low-level features from high-level features at each layer to improve the interpretability of DNNs.
Findings
Achieves competitive accuracy on standard datasets
Enhances model interpretability through feature separation
Publicly available implementation for reproducibility
Abstract
Self-explaining models reveal their decision-making parameters in an interpretable manner, so that the model's reasoning process can be directly understood by human beings. General Linear Models (GLMs) are self-explaining because the model weights directly show how each feature contributes to the output value. However, deep neural networks (DNNs) are in general not self-explaining, due to the non-linearity of the activation functions, complex architectures, and obscure feature extraction and transformation processes. In this work, we illustrate the fact that existing deep architectures are hard to interpret because each hidden layer carries a mix of low-level and high-level features. As a solution, we propose a novel feature leveling architecture that isolates low-level features from high-level features on a per-layer basis to better utilize the GLM layer in the proposed…
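To make the routing idea in the abstract concrete, here is a minimal sketch, not the authors' implementation. It assumes a fixed, hand-written boolean mask per level that decides which features go straight to the final GLM and which are passed into the next hidden layer; how the paper actually selects features is not described in this summary. The names `feature_leveling_forward`, `keep_masks`, and `dense_tanh`, the layer sizes, and the random weights are all hypothetical.

```python
import math
import random

def dense_tanh(x, weights, biases):
    """One hidden layer: tanh(W x + b), with W given as a list of rows."""
    return [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, biases)]

def feature_leveling_forward(x, layers, keep_masks, glm_weights, glm_bias):
    """Route each feature either straight to the final GLM or into the next layer.

    keep_masks[k][i] is True when feature i at level k is sent directly
    to the GLM (hypothetical fixed masks, an assumption of this sketch).
    """
    leveled = []                      # features that reach the GLM, ordered by level
    current = list(x)
    for (weights, biases), mask in zip(layers, keep_masks):
        leveled += [v for v, keep in zip(current, mask) if keep]
        passed_on = [v for v, keep in zip(current, mask) if not keep]
        current = dense_tanh(passed_on, weights, biases)
    leveled += current                # output of the last hidden layer is the top level
    # The final layer is linear, so each leveled feature has an additive contribution.
    contributions = [w * v for w, v in zip(glm_weights, leveled)]
    return sum(contributions) + glm_bias, leveled, contributions

random.seed(0)
rand_matrix = lambda rows, cols: [[random.uniform(-1, 1) for _ in range(cols)]
                                  for _ in range(rows)]

x = [0.5, -1.2, 0.3, 2.0]                       # four raw input features
keep_masks = [[True, False, False, True],       # level 0: keep x[0] and x[3]
              [False, True, False]]             # level 1: keep the second hidden unit
layers = [(rand_matrix(3, 2), [0.0] * 3),       # 2 passed-on inputs -> 3 hidden units
          (rand_matrix(2, 2), [0.0] * 2)]       # 2 passed-on units  -> 2 hidden units
glm_weights = [random.uniform(-1, 1) for _ in range(5)]   # 2 + 1 + 2 leveled features
glm_bias = 0.1

y, leveled, contributions = feature_leveling_forward(
    x, layers, keep_masks, glm_weights, glm_bias)
for i, c in enumerate(contributions):
    print(f"leveled feature {i}: contribution {c:+.4f}")
print(f"output: {y:+.4f}")
```

Because the last step is a plain weighted sum, the output decomposes exactly into one term per leveled feature, which is the GLM property the abstract relies on. The raw inputs kept at level 0 reach the GLM unchanged, so their weights can be read in the original feature space.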
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Advanced Neural Network Applications
