A Comprehensive Survey on Self-Interpretable Neural Networks

Yang Ji, Ying Sun, Yuting Zhang, Zhigaoyuan Wang, Yuanxin Zhuang, Zheng Gong, Dazhong Shen, Chuan Qin, Hengshu Zhu, and Hui Xiong

arXiv:2501.15638 · cs.LG · November 21, 2025 · 2 cites

TL;DR

This survey reviews self-interpretable neural networks: it categorizes their methodologies, provides visual examples of model explanations, discusses applications and evaluation metrics, and outlines future challenges in the field.

Contribution

It systematically summarizes existing self-interpretable neural network methods, introduces a structured taxonomy, and offers a resource for tracking ongoing research developments.

Findings

01 Categorized self-interpretable methods from five key perspectives: attribution-based, function-based, concept-based, prototype-based, and rule-based.

02 Provided visualized examples of model explanations across domains.

03 Identified open challenges and future directions in self-interpretability.

Abstract

Neural networks have achieved remarkable success across various fields. However, their lack of interpretability limits their practical use, particularly in critical decision-making scenarios. Post-hoc interpretability, which provides explanations for pre-trained models, often suffers from limited robustness and fidelity. This has inspired rising interest in self-interpretable neural networks, which inherently reveal the prediction rationale through their model structures. Although surveys on post-hoc interpretability exist, a comprehensive and systematic survey of self-interpretable neural networks is still missing. To address this gap, we first collect and review existing works on self-interpretable neural networks and provide a structured summary of their methodologies from five key perspectives: attribution-based, function-based, concept-based, prototype-based, and rule-based…

Code & Models

Repositories

yangji721/awesome-self-interpretable-neural-network
PyTorch · Official

Taxonomy

Topics: Machine Learning in Healthcare · Neural Networks and Applications · Explainable Artificial Intelligence (XAI)