A Self-explaining Neural Architecture for Generalizable Concept Learning
Sanchit Sinha, Guangzhi Xiong, Aidong Zhang

TL;DR
This paper introduces a self-explaining neural architecture that enhances concept learning by improving interpretability, domain generalization, and concept fidelity through novel modules and regularization techniques, validated on multiple datasets.
Contribution
The paper proposes a new architecture with a concept saliency network, contrastive learning, and prototype-based regularization to address fidelity and interoperability issues in concept learning.
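The contribution above names contrastive learning as one component. As a rough illustration of the general idea only, the sketch below computes a generic InfoNCE-style loss for a single anchor embedding. This summary does not give the paper's actual objective, so the function names (`cosine`, `info_nce`), the cosine similarity measure, and the `temperature` value are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE loss for one anchor (hypothetical helper, not the paper's loss).

    The loss is low when the anchor is closer to the positive than to
    any of the negatives, and high otherwise.
    """
    pos_logit = cosine(anchor, positive) / temperature
    neg_logits = [cosine(anchor, n) / temperature for n in negatives]
    logits = [pos_logit] + neg_logits
    # Log-sum-exp with the max subtracted for numerical stability.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
    # Negative log-probability of picking the positive among all candidates.
    return log_denominator - pos_logit


# Anchor and positive point the same way; the negative is orthogonal.
aligned = info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
# Roles swapped: the "positive" is orthogonal, the negative matches the anchor.
misaligned = info_nce([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
```

In this toy setup `aligned` is close to zero and `misaligned` is large. That is the behavior a contrastive objective relies on: it pulls representations of matching pairs together and pushes non-matching pairs apart.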
Findings
Improves concept overlap and fidelity.
Enhances domain adaptation performance.
Outperforms state-of-the-art methods on real-world datasets.
Abstract
With the wide proliferation of Deep Neural Networks (DNNs) in high-stakes applications, there is a growing demand for explainability of their decision-making process. Concept learning models attempt to learn high-level 'concepts', abstract entities that align with human understanding, and thus provide interpretability to DNN architectures. However, in this paper, we demonstrate that present state-of-the-art (SOTA) concept learning approaches suffer from two major problems: a lack of concept fidelity, wherein the models fail to learn consistent concepts among similar classes, and limited concept interoperability, wherein the models fail to generalize learned concepts to new domains for the same task. Keeping these in mind, we propose a novel self-explaining architecture for concept learning across domains which i) incorporates a new concept saliency network for representative concept selection, ii) utilizes…
Taxonomy
Topics: Neural Networks and Applications
Methods: Contrastive Learning · ALIGN
